URL Canonicalization

Is Your Site a Victim of URL Canonicalization?

by Nicolas Prudhon · 45 comments

in Search Engine Optimization

Often in SEO, there are some things you may miss that may ultimately have a negative effect on your site. Nicolas Prudhon goes over that “one thing” you may have overlooked while working on your site’s SEO.

You have been working hard to optimize your site, and yet you may have forgotten one thing. If you are at all conscientious about your site’s SEO, you probably have a good keyword, a good page title, backlinks and so on.

Did you know that because of the one thing you forgot to do, part of that hard work is going to waste?

Take a few seconds to look at the following URLs:

  • http://www.mydomain.com
  • http://mydomain.com
  • http://www.mydomain.com/index.html
  • http://www.mydomain.com/index.php
  • http://mydomain.com/index.html
  • http://mydomain.com/index.php

If your site is like most websites out there, those six URLs very likely return the same page.

If you analyze each of those URLs separately, however, you may find that each has its own set of backlinks and its own PageRank (PR).

This is a URL canonicalization problem.

Indeed, to you and your visitors, all those links may resolve to the same page, but for the search engines, those are 6 very different pages. The decision is then left to the search engine as to which one it should show you.

In most cases, the URL returned will be the one with the most inbound and outbound links, so the process is usually invisible to the webmaster.

The main problem for SEO is that part of your effort leaks onto those different URLs instead of building up a single one.

To remedy this, you must tell the search engines that those URLs are in fact one and the same.

By doing so, you concentrate all of your SEO work on a single URL.

On an Apache server with mod_rewrite enabled, this is done simply by adding a few directives to your .htaccess file, as follows:

  • To redirect http://mydomain.com to http://www.mydomain.com:
    RewriteEngine On
    # Match requests whose Host header is the bare domain (case-insensitive)
    RewriteCond %{HTTP_HOST} ^mydomain\.com [NC]
    RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]

  • To redirect http://www.mydomain.com/index.html to http://www.mydomain.com:
    RewriteEngine On
    # Only act when the visitor explicitly requested index.html
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
    RewriteRule ^(.*)index\.html$ /$1 [R=301,L]

  • To redirect http://www.mydomain.com/index.php to http://www.mydomain.com:
    RewriteEngine On
    # Only act when the visitor explicitly requested index.php
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.php\ HTTP/
    RewriteRule ^(.*)index\.php$ /$1 [R=301,L]
    

Note: don’t forget to replace “mydomain” with your actual domain name.
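
The three snippets above can also live in a single .htaccess block. The following is only a sketch under a few assumptions: Apache with mod_rewrite enabled, the file sitting in the site’s document root, and www.mydomain.com chosen as the canonical host. Merging the two index rules into one pattern and redirecting straight to the www host are choices made here for brevity, not something the steps above require.

```apache
# Sketch: combined canonicalization rules for a document-root .htaccess file.
# Assumes mod_rewrite is enabled; "mydomain.com" is a placeholder domain.
RewriteEngine On

# 1. Strip an index.html or index.php that the visitor explicitly requested.
#    THE_REQUEST holds the original request line, so internal lookups of the
#    directory index file do not trigger a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.(html|php)\ HTTP/
RewriteRule ^(.*)index\.(html|php)$ http://www.mydomain.com/$1 [R=301,L]

# 2. Send every remaining request for the bare domain to the www host.
RewriteCond %{HTTP_HOST} ^mydomain\.com [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
```

With rules like these in place, a request for http://mydomain.com/index.php should be answered with a single 301 to http://www.mydomain.com/. If your site posts forms to index.php, test carefully before relying on the first rule.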

What This Code Does

By adding those directives to your .htaccess file, you are telling browsers and search engines that, regardless of which of those URLs is requested, they should be taken to http://www.mydomain.com.

The method used is called a 301 (permanent) redirect. It is an extremely powerful SEO tool, because a page moved or redirected this way doesn’t lose its value.

It may sound a bit complicated and technical, but as far as you are concerned, it’s only a matter of copy and paste (and replacing “mydomain” with your actual domain name).

Congratulations, you can now enjoy the full benefits of your SEO work!

Article by Nicolas Prudhon

Nicolas Prudhon is an Internet Marketing & SEO strategist, as well as a published author. Through SEO Help by Nicolas Prudhon, he is dedicated to sharing his knowledge and experience. Join his latest free SEO training course "21 Days SEO Mastery" now!

Summary

URL Canonicalization is a very important, but very neglected step in SEO. Learn what URL Canonicalization is, and multiple ways to implement it on your website.

Key Points

  • URLs such as /index.html and /index.php return the same page as the bare domain, but search engines treat each one as a different page.
  • URL Canonicalization can be achieved with some simple .htaccess edits (AKA, 301 Redirects).

{ 41 comments }

1 Alex March 23, 2009 at 2:57 pm

Thanks for the great post Nick. I’ve never heard of URL Canonicalization, but this is something that I’m now going to apply to this blog, and any future blogs. I’m no SEO genius, so this really helps.

Thanks a lot, and I hope to see some more posts from you soon! :)

2 Nicolas Prudhon March 23, 2009 at 8:20 pm

Actually Alex, although this article is much more “technical” than what I usually write, it is exactly because a lot of people have never heard about it that I thought it might come in handy ;)

Anyway, I’m glad to be of some help and thank you for publishing my article so fast!

3 Stuart Conover March 23, 2009 at 4:34 pm

The major search engines actually added a tag recently that allows you to tell them what url you want for canonicalization. Within 2 days of the announcement there were 3 wordpress plugins that were out to auto create the tag on pages for you. Just another route to look into.

I really hate to link to one of my own sites in a comment, but I did a QUICK writeup here: http://www.stuartconover.com/2009/02/19/canonical-links-plug-ins/ with links to 2 WordPress plugins and 1 Joomla plugin. Figured they might prove useful after today’s post ;)

4 Alex March 23, 2009 at 8:19 pm

That’s pretty interesting. Thanks for sharing that Stuart!

5 Nicolas Prudhon March 23, 2009 at 8:24 pm

Hi Stuart, I believe that it is not the “link” itself that gives a spam feel, but rather if it is related and helpful to the discussion or not.

As Alex mentioned earlier, people have very little information about it, so anything that helps their understanding is more than welcome!

Honestly, I wasn’t even aware of those plugins myself, so I have to thank you too for sharing these resources.

That’s really something I love about these blog discussions: we can all share and learn so much!

6 Dennis Edell March 24, 2009 at 11:25 am

Will installing the plugins have the exact same effect? I’d rather not mess with the .htaccess if I don’t have to.

Btw, my commentluv shows another Nicolas guest post. ;)

7 Nicolas Prudhon March 24, 2009 at 8:45 pm

Hi Dennis, from what I have read about these plugins, although they may appear to do the same thing, I think there is a major difference.

“This simple plugin helps you easily tell the search engine bots the preferred version of a page by specifying the canonical properly within your head tag.”

The code used in the .htaccess file uses a 301 redirect, which not only tells the search engines your exact canonical setting, but ALSO maintains and unifies all your SEO stats (ranking, PR, etc.).

The code in your .htaccess file is merely 3 small lines, versus extra meta information in the header of each and every page of your site.

It’s very easy to check whether you have done things right. After updating your .htaccess file, just try to access your site without the www; if the page that opens automatically is the one with www, you have done things correctly.

8 Dennis Edell March 25, 2009 at 1:48 pm

Thanks for that, I’ll give it a go.

9 David Lemcoe March 23, 2009 at 9:15 pm

Very well done. .htaccess and httpd.conf files make all the difference in SEO in most cases.

Great post,

Big D

10 Nicolas Prudhon March 23, 2009 at 9:21 pm

Thank you David,

I really appreciate the comment.

11 Dean Saliba March 24, 2009 at 9:21 am

I have also never heard of URL Canonicalization before.

Something I think a lot of us should be looking into.

12 Nicolas Prudhon March 24, 2009 at 8:00 pm

Thanks for the comment Dean!

It’s not really surprising, actually. As humans, we don’t really ask ourselves this kind of question. Common sense tells us that it’s the same page that is returned… but search engines don’t have common sense.

13 Alex March 24, 2009 at 8:39 pm

I have always realized that index.html, index.php, etc. are all different pages, but never really factored in the SEO of it all. Yet, I knew that using WWW and Non WWW is a no-no, and you should redirect them just like with the index pages. Seriously disappointed with myself for not seeing that at first, haha.

14 Nicolas Prudhon March 24, 2009 at 8:50 pm

Just a quick note about that.

Although my example shows you how to redirect a non www site towards a www site, the opposite can be done too.

From an SEO point of view, it doesn’t matter which version you choose, with www or without www. What matters is that you are consistent with your choice.
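
For the opposite direction, a minimal sketch could look like the following. It assumes the same Apache and mod_rewrite setup as the article, and “mydomain.com” is still a placeholder.

```apache
# Sketch: redirect the www host to the bare domain (the reverse of the article).
# Assumes mod_rewrite is enabled; replace the placeholder "mydomain.com".
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.mydomain\.com [NC]
RewriteRule ^(.*)$ http://mydomain.com/$1 [R=301,L]
```

Use only one direction per site; enabling both at once would bounce visitors back and forth.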

15 Miami web design March 24, 2009 at 9:51 am

Yes, valid point raised. Google sees links with www and without www as different pages, hence your link juice is split between two pages. So the first thing we do in on-site optimization is redirect the non-www URL to the www URL.

16 Nicolas Prudhon March 24, 2009 at 8:06 pm

You’re absolutely right. Actually, most people only think of redirecting their non-www URL to their www URL, but it’s good to include the index page too. When you think about it, it’s the same problem, and it doesn’t take much more work.

17 Kai Lo March 26, 2009 at 10:56 pm

I’m glad I started blogging with a purchased domain url. If I didn’t, I would have a lot of backlinks to a domain.blogspot.com link instead.

18 Nicolas Prudhon March 26, 2009 at 11:38 pm

Considering how cheap domain names are now, I would definitely recommend that people get one as soon as possible. It’s really sad to see your hard work going to waste just because you didn’t plan properly at first.

19 Evan March 30, 2009 at 5:10 pm

Great informational post, never knew about this sort of thing.

Had to rattle my brain a little for this one! ;)

20 Nicolas Prudhon March 30, 2009 at 6:24 pm

Hi Evan, sorry that this post is a bit technical. However, even without going into the details of what the code itself means, the concept is not that hard to understand.

I’m glad that I have been able to help you there.

21 Davey March 31, 2009 at 1:39 am

So that means that www or non-www, it doesn’t matter? I mean, I can change the domain name of my blog by going to the WordPress settings and changing the blog URL. But what do you suggest? Attaching a www to my blog name, or just keeping it as it is and maintaining the uniformity? Does it have something to do with SEO?

22 Nicolas Prudhon March 31, 2009 at 5:24 am

Hi Davey, that’s correct: www or non-www is the same, it’s purely a matter of personal preference. Whichever form you use, make sure to keep using it everywhere; if you use the non-www form, make sure not to include www in your inbound links.

And yes, it has to do with your SEO and where you distribute your “link power”. The tip shown in this article helps you channel all those leaks into one uniform URL.

23 Andrew April 23, 2009 at 4:34 am

I couldn’t get it to work. The www redirect did nothing and the other 2 gave error 500. My domain is running on a RedHat host. I changed mydomain to the correct one.

24 Nicolas Prudhon April 23, 2009 at 8:50 pm

Hi Andrew,

Quite surprising but anyway, try the following code:
===========================================
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
===========================================
Note: don’t forget to replace “domain” with your actual domain name!

Let me know how it goes.

25 Andrew April 25, 2009 at 11:45 am

Thanks Nicolas!
That worked.

26 Nicolas Prudhon April 25, 2009 at 12:01 pm

Awesome Andrew!

If you have any other questions, don’t hesitate to ask, I’m here to help!

27 Mike Dalton November 17, 2009 at 1:45 pm

Thanks Nicholas – Great resource
I’m already using the redirect to the www – wish I’d run across this a year ago!
However, I am getting 500 errors trying to use the code to get the .com instead of the .com/index.htm (I did replace the .html in your code with .htm)
I’m using GoDaddy running Linux, if that makes a difference.

Thanks much!

28 Manual Web Directory May 10, 2009 at 12:51 pm

Finally, someone who can write a good blog! This is the kind of information that is useful to those who want to improve their SERP rankings. I loved your post and will be telling others about it. Subscribing to your RSS feed now. Thanks

29 Nicolas Prudhon May 10, 2009 at 7:29 pm

Yes, Alex and Janith are doing an excellent job with this blog!

30 Make Money Online May 28, 2009 at 11:48 am

I tried it two days ago. I wanted to redirect my blog URL from the www version to the non-www version. After I did it, I couldn’t access my blog anymore, so for now I have left the code as it was, without changing anything.
Btw, with this method, can we also redirect a single post?

Regards,
Lee

31 Nicolas Prudhon May 28, 2009 at 6:40 pm

Hi Lee, I replied to you on my blog.
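
For readers with the same question about a single post, one possible approach is a one-line rule like the sketch below. It is not taken from the reply mentioned above; “old-post” and “new-post” are hypothetical paths, “mydomain.com” is a placeholder, and Apache with mod_rewrite is assumed.

```apache
# Sketch: permanently redirect one old post URL to its new location.
# "old-post" and "new-post" are hypothetical paths; replace them with real ones.
RewriteEngine On
RewriteRule ^old-post/?$ http://www.mydomain.com/new-post/ [R=301,L]
```

If the .htaccess file already contains rewrite rules generated by a blogging platform, a rule like this would likely need to sit above them so that it is evaluated first.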

32 Miami Web Design August 3, 2009 at 4:16 am

Thanks. I’ll update my .htaccess files right away

33 ZQ | Travel Blog August 8, 2009 at 5:08 am

Great post – learned something new today – will edit my .htaccess files now (now I realise how important it is to the security of my blog)

ZQ

34 Nicolas Prudhon August 9, 2009 at 12:39 am

Actually the primary use is for SEO, but it does indeed potentially also solve some security issues. ;)

35 Adam Baird August 19, 2009 at 5:04 pm

Doesn’t Thesis automatically take care of this for you?

36 Seth August 19, 2009 at 5:20 pm

Yes it does! Another great reason to get Thesis Theme.

37 web Gift September 12, 2009 at 9:02 pm

thanks for sharing this info :)

38 Volksphone September 19, 2009 at 9:33 am

Thanks for this mod_rewrite tutorial. I always have problems setting it up correctly.

39 New Cars Guru September 22, 2009 at 2:14 am

The best way for me is to choose http://www.domain.com over domain.com. It seems more natural to me this way. And if you have set up one way or another, please remember NOT TO SWITCH later, when you have brought your blog to some level of development, as it may lower your PR rank, and SERP results…

40 Prisqua October 22, 2009 at 9:00 am

Thanks for the info, but when I tried to add that code to my .htaccess file I got a “500 Internal Error”. I saw in another comment that you give different code; is it meant to replace the code in the article? I am just a bit confused…

41 Nicolas Prudhon November 10, 2009 at 1:41 am

Hi Prisqua,

Sorry for the late reply.

Yes, the other code I gave is to be used instead of the previous one. Depending on your server configuration and .htaccess file, you may have experienced some difficulties with the first code, but you should be fine with the other one I provided.