Duplicate content test and URL canonicalization
A few days ago I uploaded the following script to my server:
<?php
if ($_SERVER["QUERY_STRING"] == 'foo&bar')      { echo "index test one"; }
if ($_SERVER["QUERY_STRING"] == 'bar&foo')      { echo "bar and foo"; }
if ($_SERVER["QUERY_STRING"] == 'bar&foo&test') { echo "bar and foo"; }
?>
I then published 3 links to my site’s index so Google could follow them:
http://cherouvim.com/foo.php?foo&bar
http://cherouvim.com/foo.php?bar&foo
http://cherouvim.com/foo.php?bar&foo&test
A few days later I got this result for the Google query site:cherouvim.com/foo:
The first and third results are the same (duplicate content), yet Google has indexed both of them. This is a common SEO problem in dynamic web sites, where many different URLs can point to the same page (paginators, out-of-date URLs, archive pages etc) or where you want to do URL referrer tracking.
Google has recently published a way of overcoming this problem: you can now specify which is the real (or primary) URL for the page. E.g.:
<link rel="canonical" href="/foo.php?foo&bar" />
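One way to attack the root cause in a script like the one above is to normalize the query string before emitting the canonical tag, so that every ordering of the same parameters maps to a single URL. A minimal sketch (the `canonical_url` helper is hypothetical, not part of the original script):

```php
<?php
// Hypothetical helper (not from the post): sort the query parameters so
// that every ordering of the same parameters yields one canonical URL.
function canonical_url($path, $query) {
    $params = explode('&', $query); // "foo&bar" -> array("foo", "bar")
    sort($params);                  // array("foo", "bar") -> array("bar", "foo")
    return $path . '?' . implode('&', $params);
}

// Emit the canonical tag; the fallback query is only for a CLI dry run.
$qs = isset($_SERVER['QUERY_STRING']) ? $_SERVER['QUERY_STRING'] : 'foo&bar';
echo '<link rel="canonical" href="' . canonical_url('/foo.php', $qs) . '" />';
?>
```

With this in the page head, both `/foo.php?foo&bar` and `/foo.php?bar&foo` declare the same canonical href, so Google should consolidate them into one indexed entry.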
So, as SEOmoz said, this definitely is The Most Important Advancement in SEO Practices Since Sitemaps.
July 13th, 2009 at 17:58
I used to publish my articles, but now I wonder whether I should stop doing this because of the risk of a duplicate content penalty. Should I stop publishing my articles on article directories?
December 13th, 2009 at 14:46
Introspective,
Duplicate content theory as we know it applies only to on-site duplicate content.
Approximately 30% of all the web’s information is duplicate content. Just think about how much duplicate content is passed between prominent news agencies worldwide – surely you’ve read breaking news with the same content on multiple news sites.
It depends on the particular article directory whether it tolerates duplicated articles. From an SEO perspective it is better to spin the article so that search engines perceive it as unique content. This lets you dominate the top 10 positions in the SERPs for certain keywords.
After you’ve posted the same article on multiple directories, that piece of content will show up multiple times on the same search results page for a while. But as Google and the other search engines filter the new information, they will retain only one copy of that article, generally the first one to be indexed or the one hosted on the directory with the highest ranking score.
I’ve made a post on duplicate content on my blog, if you’re interested in the subject. Here’s the link: http://trafficcpanel.com/871/duplicate-content-is-your-business-website-silently-infected/
Hope it helps!
Cheers
Cristian
December 14th, 2009 at 0:00
In these times where social sites, Twitter, Facebook and the like are taking over the net, it’s hard to tell what’s duplicate content. If someone bookmarks my post on Digg, Mixx, Delicious and others, how can I prevent my ‘duplicate content’ from being distributed over the internet and harming my true content and web page? In my opinion, search engines don’t really pay that much attention to duplicate content because they can’t reliably detect what is original and what is duplicate.
May 5th, 2010 at 21:05
I usually submit 300-word articles to article directories to help me gain backlinks and readers.