
February 24, 2007

mod_rewrite

Filed under: mod_rewrite, Site Redesign & Findability — susan @ 6:17 am

Ok, so back to my site redesign/move issues. In a previous post, I had written that search engines search by domain name, not IP address. This week, I found some contradictory information that claims that some search engines DO search by IP address instead of domain name.

The article I found was first published in ’03 and updated in April ’04, so it may be outdated. It’s also possible that the major search engines search by domain name, but some of the minor ones search by IP address. I’m not sure it’s worth spending time trying to figure out what’s what, since it could change next week. However, a safe bet would be to assume the worst, that they search by IP address, and keep your old site up for a few months while you’re redirecting traffic to your new site. During this time, check your traffic stats to watch for search engine spider activity at both the old and the new sites. Once the major search engines have crawled your new site, you can take the old one down.

The best ways to redirect traffic are either plain 301 redirects, which I discussed in an earlier post, or mod_rewrite. The two aren’t mutually exclusive, since a mod_rewrite rule can itself send a 301. I tried to find a solid source that could tell me why you would use one over the other. I don’t feel I’ve found “the” answer yet, but it seems that if you’re making a lot of page and folder name changes, mod_rewrite might be easier, because one pattern can cover many URLs. However, mod_rewrite is only for servers running Apache, whereas a 301 is a standard HTTP status code, so 301s can be used with other server software.
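
To make the difference a bit more concrete, here is a rough sketch of both approaches in an .htaccess file. The page and folder names (about-us.html, about.html, oldfolder, newfolder) are made up for illustration, and it assumes the file sits in the site's root folder with mod_alias and mod_rewrite enabled.

```apache
# Plain 301 for a single renamed page (Redirect comes from mod_alias).
# Hypothetical page names.
Redirect 301 /about-us.html http://www.yoursite.com/about.html

# Pattern-based 301 for a renamed folder (mod_rewrite).
# Hypothetical folder names; whatever follows oldfolder/ is captured
# as $1 and carried over to the new location.
RewriteEngine On
RewriteRule ^oldfolder/(.*)$ http://www.yoursite.com/newfolder/$1 [R=301,L]
```

The first line handles exactly one page. The RewriteRule covers every URL under the old folder with a single pattern, which is where mod_rewrite seems to earn its keep when a lot of names change at once.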

That being said, what exactly is mod_rewrite? It’s an Apache module that rewrites a requested URL on the fly. It’s very search engine friendly. It does look like gobbledygook, since it works with regular expressions, which always make my eyes cross, but there are places all over the web where you can copy and paste what you need. There also seem to be a number of forums for asking mod_rewrite questions if you get stuck. mod_rewrite rules usually go in the .htaccess file, although they can also be placed in the main Apache configuration.
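
As a small illustration of "rewriting on the fly", here is a minimal sketch of an internal rewrite. The script name product.php and its id parameter are hypothetical, and the rule assumes it lives in the .htaccess file in the site's root folder.

```apache
RewriteEngine On
# A request for /products/42 is quietly served by /product.php?id=42.
# ([0-9]+) captures one or more digits, which are reused as $1.
# There is no R flag, so no redirect is sent and the address in the
# browser stays as /products/42.
RewriteRule ^products/([0-9]+)$ /product.php?id=$1 [L]
```

Reading the pattern piece by piece: ^ marks the start of the path, $ marks the end, and the parentheses remember whatever matched inside them so it can be reused on the right-hand side.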

One use of mod_rewrite that I saw popping up quite a lot addresses a concern over Google seeing yoursite.com and www.yoursite.com as two separate sites. This is referred to as the “Google Canonical problem” and the reason it’s important is that it can split your stats (two pages with 500 hits each won’t rank as high as one page with 1000 hits) and/or give you a duplicate content flag. The mod_rewrite code for making sure Google will see one site instead of two is:

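RewriteEngine On
# Any host other than www.yoursite.com (dots escaped, case-insensitive)...
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com$ [NC]
# ...is sent to the www address with a permanent (301) redirect; L stops further rules
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]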
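
The same pattern looks like it could cover the site move I described above. This is only a sketch under assumptions: oldsite.com and newsite.com are placeholder domains, and the rules would go in the .htaccess file on the old site while it stays up.

```apache
RewriteEngine On
# Match the old domain with or without the www prefix (placeholder names).
RewriteCond %{HTTP_HOST} ^(www\.)?oldsite\.com$ [NC]
# Send every path to the same path on the new domain with a 301.
RewriteRule ^(.*)$ http://www.newsite.com/$1 [R=301,L]
```

Because the requested path is carried over in $1, this only helps for pages that kept the same name on the new site. Renamed pages and folders would still need their own rules.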