
The following is a guest post by Lovedeep Wadhwa, a technology analyst who shares blogging tips and Internet-related news on his blog, Freakitude. Please visit his blog for more articles on blogging and technology.
WordPress Duplicate Content Bug
Greg Mulhauser brought to my attention a duplicate content vulnerability present in WordPress and Movable Type.
If you are a WordPress user with permalinks enabled on your blog, you may have noticed that the content of each post is accessible from an effectively unlimited number of different URLs. You just have to append an extra sequence of digits to the end of a post's URL.
For example, take a look at this recent post from Matt Cutts's blog.
The same post is also available at these URLs:
http://www.mattcutts.com/blog/minty-fresh-indexing/123456/
http://www.mattcutts.com/blog/minty-fresh-indexing/45678/
When we access the post through these URLs, WordPress returns neither a 301 redirect nor a 404 error. It simply serves the post content at each of them.
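As a rough way to check a blog for this behaviour, the sketch below builds the digit-suffixed variants of a post URL and classifies the HTTP status a server returns for them. The function names (`variant_urls`, `fetch_status`, `verdict`) are made up for this illustration, and `fetch_status` needs network access, so treat it as an untested starting point rather than a finished tool.

```python
import http.client
from urllib.parse import urlsplit


def variant_urls(post_url, suffixes=("123456", "45678")):
    """Append each digit sequence to the post URL as an extra path segment."""
    base = post_url.rstrip("/")
    return [f"{base}/{suffix}/" for suffix in suffixes]


def fetch_status(url, timeout=10):
    """Return the raw HTTP status for url (http.client does not follow redirects)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def verdict(status):
    """Translate a status code into what it means for duplicate content."""
    if status in (301, 308):
        return "permanent redirect"
    if status == 404:
        return "not found"
    if status == 200:
        return "served as-is (duplicate content risk)"
    return "other"
```

Calling `variant_urls("http://www.mattcutts.com/blog/minty-fresh-indexing/")` yields the two example URLs listed above. On an affected blog, `verdict(fetch_status(u))` would be expected to report "served as-is" for each variant.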
Fix the Duplicate Content Bug
If you run a self-hosted WordPress blog, you can fix the vulnerability by placing the following rules in your .htaccess file.
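<IfModule mod_rewrite.c>
RewriteEngine On
# Leave paginated listings such as /page/2/ alone
RewriteCond %{REQUEST_URI} !.*(/page/[0-9]*/?)$
# Leave year, month and day archives (2000-2009) alone
RewriteCond %{REQUEST_URI} !^/200[0-9]/?$
RewriteCond %{REQUEST_URI} !^/200[0-9]/[01][0-9]/?$
RewriteCond %{REQUEST_URI} !^/200[0-9]/[01][0-9]/[0-3][0-9]/?$
# Strip a trailing all-digit path segment and redirect permanently
RewriteRule (.*)(/[0-9]+/?)$ $1/ [R=301,L]
</IfModule>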
Anyone who tries to access the post through one of these URLs will then get an SEO-friendly 301 redirect to the original post.
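To make it easier to see which URLs the rules above touch, here is a small Python simulation of their logic. It is only a sketch: it assumes the rules sit in an .htaccess file at the site root (where Apache matches the RewriteRule pattern against the path without its leading slash), and the name `redirect_target` is invented for this example.

```python
import re

# The four RewriteCond patterns; a request matching any of them is left alone.
EXCLUSIONS = [
    re.compile(r".*(/page/[0-9]*/?)$"),
    re.compile(r"^/200[0-9]/?$"),
    re.compile(r"^/200[0-9]/[01][0-9]/?$"),
    re.compile(r"^/200[0-9]/[01][0-9]/[0-3][0-9]/?$"),
]

# The RewriteRule pattern: anything followed by a trailing all-digit segment.
RULE = re.compile(r"(.*)(/[0-9]+/?)$")


def redirect_target(request_uri):
    """Return the path the rules would redirect to, or None if no redirect."""
    if any(pattern.search(request_uri) for pattern in EXCLUSIONS):
        return None
    # Assumption: per-directory context, so the leading slash is stripped
    # before the RewriteRule pattern is applied.
    match = RULE.search(request_uri.lstrip("/"))
    if match is None:
        return None
    return "/" + match.group(1) + "/"
```

For instance, `redirect_target("/blog/minty-fresh-indexing/123456/")` gives `/blog/minty-fresh-indexing/`, while `/page/2/` and `/2008/05/` are left alone. One thing the simulation suggests is that the date exclusions only cover the years 2000 to 2009: under these assumptions a monthly archive such as `/2010/05/` would be redirected to `/2010/`, so the year patterns may need widening on a blog with newer archives.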
If you enforce a www preference on your blog, you need to take some special precautions to avoid a double redirect.
I recommend reading this post to learn more. It also covers fixing this duplicate content issue for Movable Type and WordPress MU blogs.

Sizlopedia is a technology blog created by Saad Hamid, a problogger from Pakistan.
Lovedeep, thanks a lot for bringing this issue to the attention of all bloggers out there.
Especially for bloggers like me, who care a great deal about their blog's search engine optimization, this post is truly a must-read.
I am worried only because I do have the www preference set on my blog, and I am still unsure how to fix the bug without triggering a double redirect.
Many thanks for bringing this news to light. And thanks for the remedial measure as well.
Thanks for the comments.
If you are using the www preference, insert your own domain with the www prefix (written as http://www.domain.com/ in the rules below) in front of the $1 to avoid a double redirect.
RewriteEngine On
RewriteCond %{REQUEST_URI} !.*(/page/[0-9]*/?)$
RewriteCond %{REQUEST_URI} !^/200[0-9]/?$
RewriteCond %{REQUEST_URI} !^/200[0-9]/[01][0-9]/?$
RewriteCond %{REQUEST_URI} !^/200[0-9]/[01][0-9]/[0-3][0-9]/?$
RewriteRule (.*)(/[0-9]+/?)$ http://www.domain.com/$1/ [R=301,L]
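The difference between the two versions of the rule can be illustrated with a toy Python model of the redirect chain. This is a sketch under stated assumptions: the digit-stripping rule is taken to run before the www redirect, the RewriteCond exclusions are left out for brevity, and `/my-post/` is a hypothetical permalink; `next_hop` and `redirect_chain` are names invented for the example.

```python
import re

RULE = re.compile(r"(.*)(/[0-9]+/?)$")
CANONICAL_HOST = "www.domain.com"  # placeholder domain taken from the rules above


def next_hop(host, path, absolute_target):
    """Return the (host, path) this request is redirected to, or None."""
    match = RULE.search(path.lstrip("/"))
    if match:
        # The relative form ($1/) stays on the requested host; the absolute
        # form jumps straight to the www host.
        target_host = CANONICAL_HOST if absolute_target else host
        return target_host, "/" + match.group(1) + "/"
    if host != CANONICAL_HOST:
        # Assumed www-preference redirect, applied after the rule above.
        return CANONICAL_HOST, path
    return None


def redirect_chain(host, path, absolute_target):
    """Follow redirects until none applies and list every URL visited."""
    hops = []
    while (hop := next_hop(host, path, absolute_target)) is not None:
        host, path = hop
        hops.append(f"http://{host}{path}")
    return hops
```

In this model, a request for `/my-post/123/` on `domain.com` takes two hops with the relative rule (first to `http://domain.com/my-post/`, then to `http://www.domain.com/my-post/`) and a single hop with the absolute one.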
Thanks, Lovedeep and DJ,
That's a very good tip. I'll try it and see if it helps my SEO.
It is harder now to find out about supplemental pages because of the changes at Google, and supplemental pages are usually caused by duplicate content.
Lovedeep, buddy, thanks a lot for helping me out.
DarrinW, you are welcome.
You are welcome, DJ.
Fantastic tip – thanks very much for the heads up.
Thank you, this is an amazing post, great share.