Opera, Apache MultiViews, Content-Location

I recently received an email complaining that the “Chords on the guitar” link on my chord tutorial page returned a 404 Not Found error.

This confused me greatly—the link is to an anchor within the same document, referenced thus:

<a href="#guitarchords">Chords on the guitar</a>

There is no way that should return a 404 error. Even if the named anchor doesn’t exist within the document, browsers should stay on the same page.

The plot thickens

The complainant also identified the page that was being requested but not found: p.php. As described on my behind-the-scenes page, this site is generated from a template and content stored in a database. All URLs are rewritten with Apache’s mod_rewrite module to refer to p.php. This shouldn’t be visible to the browser, though.

The script in p.php is written to return a 404 on direct requests, as this shouldn’t happen. Try it!

MultiViews and mod_negotiation

To allow me to remove file extensions from my URLs, I’m also using Apache’s MultiViews, from its mod_negotiation module. I knew from reading Brad Choate’s Content-Dislocation article that mod_negotiation forces the HTTP header Content-Location into the response to every negotiated request. Brad presents a method for removing the header when using Apache 1.3.x, but this doesn’t work with Apache 2.x, which I use.

Update: this is now fixed in Apache 2.0.51—see below.

As a result, the response to every request for any URL on my site had Content-Location: p.php added to it.

Opera

I went back through my logs to find the offending requests, and found that in every case (about fifteen), the browser in question was Opera. A quick Google search showed up Andrew Gregory’s writeup of the Opera behaviour (now updated with the fragment symptom as described here).

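So, it’s a known idiosyncrasy whereby Opera pays more attention to the Content-Location header than to the original URL requested. It resolves relative links, including fragment-only ones such as #guitarchords, against p.php rather than against the page the visitor asked for.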
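
To make the difference concrete, here is a minimal Python sketch of the two ways a browser could choose the base URL for a relative link. It is an illustration only, not Opera’s actual code: the example.com address and the resolve_link helper are hypothetical, and urljoin from the standard library stands in for the browser’s URL resolution.

```python
from urllib.parse import urljoin


def resolve_link(requested_url, href, content_location=None,
                 honour_content_location=False):
    """Resolve href the way a browser might (hypothetical helper).

    With honour_content_location=True the Content-Location header, itself
    resolved against the requested URL, becomes the base for relative links.
    """
    base = requested_url
    if honour_content_location and content_location:
        base = urljoin(requested_url, content_location)
    return urljoin(base, href)


# Hypothetical page address; the real URL is not given here.
page = "http://example.com/chords/"

# Base is the URL the visitor asked for: the link stays on the same page.
print(resolve_link(page, "#guitarchords", "p.php"))

# Base is taken from Content-Location: the link now points at p.php.
print(resolve_link(page, "#guitarchords", "p.php",
                   honour_content_location=True))
```

The first call yields a fragment on the page that was requested. The second yields a URL under p.php, which on this site answers direct requests with a 404.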

The fix

Previously, I wasn’t too concerned about this header being generated, but now I know it is causing some of my links to fail. A quick workaround would be to replace such links with the required URL, so that the original code would become:

<a href="index#guitarchords">Chords on the guitar</a>

However, this might cause other browsers to re-request the file unnecessarily, and I don’t like modifying my code to make up for differing browser behaviours.
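
A small Python sketch shows why this workaround helps and why it is not free. The example.com address is hypothetical, urljoin from the standard library stands in for the browser’s link resolution, and the second base assumes a browser that resolves links against Content-Location: p.php.

```python
from urllib.parse import urljoin

# Hypothetical page address; the real URL is not given here.
requested = "http://example.com/chords/"
# Base a browser would use if it trusted "Content-Location: p.php".
from_header = urljoin(requested, "p.php")

href = "index#guitarchords"

# Both bases lead to the same target, so the link works either way.
target_normal = urljoin(requested, href)
target_header = urljoin(from_header, href)
print(target_normal)
print(target_header)

# The target's document URL differs from the URL originally requested,
# which is why a browser might fetch the page again.
same_document = target_normal.split("#")[0] == requested
print(same_document)
```

Under these assumptions both browsers end up at the same address, but that address is no longer identical to the one first requested, so a re-request is possible.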

I decided that the best course of action would be to “fix” Apache—I don’t think this header should be compulsory, and I raised this in Apache’s bug system.

I’ve modified my template code to output an appropriate Content-Location header containing the correct URL, so all I need to do is remove the Apache-generated one.

The ability to remove the header has, as of 2.0.51, been back-ported from the development 2.1 branch. There are two fixes to consider, for old and new versions:

Apache 2.0.50 and earlier

To remove this header (which is optional anyway), we need to edit the file mod_negotiation.c in the modules/mappers directory of the Apache source. Comment out the relevant line thus:

/* Prevent issuing of Content-Location
     apr_table_setn(r->err_headers_out, "Content-Location",
           apr_pstrdup(r->pool, variant->file_name));
*/

You’ll then need to rebuild the module: you may have to delete mod_negotiation.so from the hidden .libs subdirectory to get the top-level make to recognize the change.

Install the new mod_negotiation.so module, and restart Apache.

Apache 2.0.51 and later

Simply edit your .htaccess or httpd.conf files to include the line:

Header always unset Content-Location

Obviously, if you edit httpd.conf, you’ll need to restart your server for it to take effect.

Header always is a less misleading name for the old ErrorHeader directive from Apache 1.3. See the Header documentation.