Google Supplemental Results and Wordpress Duplicate Content Scare

by Jack Humphrey

We have a discussion going on in the ASC forum about Google supplemental results issues for Wordpress sites. Some people are getting a high percentage of supplemental results for their content.

I even have a high percentage of pages in supplemental results myself, which I didn’t know until today for two reasons:

1. I never check.

2. I get so much traffic from Google it never occurred to me that a lot of my content might be in the dead zone.

When I look at the supplemental results in Google, I find a lot of pages that are not relevant to the discussions I currently have going on, very old pages, and just plain weird pages. Still, there are some pages in there that aren’t duplicate content that I’d like to get out of Google Hell.

I assume since Google has tackled research like building a “space elevator” that they would have already figured out that the world’s most-used blog platform, Wordpress, along with any other blog platform, is simply a database query and reporting tool.

That is, all the data for this site is in a database. This post you are reading is the only occurrence of this information in my database. Yet it can be accessed in different ways and with different urls. This post will show up in a category result, a tags page result, on-site search result, and so on. But it is still just one unique piece of content. Rocket science aside, seems like an issue that’s fairly easy to resolve on Google’s end. But they haven’t done it yet.

So I don’t have duplicate content on this site or in my database, I just have several different ways the same content can be accessed.
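To make the point concrete, here is a minimal Python sketch of one stored post being reachable at several URLs. The post data, the slug, and the `/category/` and `/tag/` URL patterns are made up for illustration; they are assumptions, not this site's actual permalink settings.

```python
# Hypothetical data: one post, stored exactly once, as in a blog database.
POSTS = {
    101: {
        "slug": "supplemental-results",
        "category": "seo",
        "tags": ["google", "wordpress"],
        "body": "Some people are getting a high percentage of supplemental results.",
    },
}


def urls_for_post(post):
    """List every URL path whose page would display this post's body."""
    urls = [f"/{post['slug']}/", f"/category/{post['category']}/"]
    urls += [f"/tag/{tag}/" for tag in post["tags"]]
    return urls


post = POSTS[101]
paths = urls_for_post(post)
for path in paths:
    # Same database row, different address.
    print(path)
```

One row in the database, four addresses for a crawler to find. On-site search and date archives would add even more.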

So do the millions of other bloggers and database content management systems in action around the web. Why then do people have such high supplemental results ratios?

I’ve decided to ask around, starting here, to see if any of the uber geeks who read my blog have insight on this issue. Please comment and give your input, including any plugins anyone may have written (or is working on, hint hint) that would block robots from going down all the pathways in Wordpress that lead to the same content.

Update: A Wordpress plugin you can check out is called “Duplicate Content Cure.” Be careful telling this thing to “no follow no index” categories or tag results pages. Google lists a lot of the tag results pages and category pages higher for me because, I assume, there’s better keyword density for all my results for a keyword (tag) search than any individual post by itself in that category.

Be careful that the “cure” doesn’t make you sicker and really read and understand what you are blocking robots from indexing to make sure you aren’t cutting off good results you already have.

Have you used this plugin? Tell us your results below!

One of our coaches found some interesting articles on the subject. Here is one that talks about the DuPrevent plugin and an example robots.txt file he uses to help prevent spiders from looking at the same things in your database twice. (Read the comments for a correction a person submitted about this guy’s robots.txt file.)
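If you want to see what a set of robots.txt rules would actually block before you upload it, Python's standard library can check it for you. The rules below are a hypothetical example, not the file from the article mentioned above, and the paths (`/search/`, `/author/`) are assumptions about how a blog's URLs might be laid out. Category and tag pages are deliberately left open here, since those can be some of your best-ranking pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block on-site search results and author archives only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /author/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path, agent="*"):
    """Return True if the given robots.txt rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, "http://example.com" + path)


for path in ["/my-post/", "/category/seo/", "/tag/google/",
             "/search/google/", "/author/jack/"]:
    print(path, "crawlable" if is_crawlable(path) else "blocked")
```

Note that this standard-library parser only does simple prefix matching, so it is a sanity check for plain `Disallow` paths rather than a full simulation of how any particular search engine reads the file.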

Essentially, this is all that is happening, so don’t start feeling like you are on a bad platform for publishing. This is Google’s problem, not yours. You just need to fix it yourself, because waiting on them to do it is out of the question.

ASIDE: Google even puts Blogger.com blog pages in supplemental results. So they don’t have this figured out yet even on their own system, lol.

Chime In! (Google reps welcome!)

How are YOUR supplemental results, and what have you done to your Wordpress or other blog system to prevent them in Google? Comments welcome and appreciated…


{ 3 trackbacks }

Wednesday Link Love | Pajama Professional
06.13.07 at 11:49 am
vickyandvali.org…another personal blog! » Blog Archive » Wordpress and Google
07.10.07 at 2:02 am
Duplicate Content Nightmares | Simon Emery
12.07.07 at 10:34 am

{ 6 comments… read them below or add one }

April Kerr 06.13.07 at 9:10 am

I’m pretty new to blogs so still have to change my old FrontPage sites to blogs.

I find Google very fickle. One site I got out of supplemental 1 week after getting into 3 of the premium paid-for directories, whilst another site is still largely supplemental despite spending several hundred $$ on directories including Yahoo.

One thing you may want to consider is checking for broken links. For instance, this page no longer exists:

“http://www.jackhumphrey.com/fridaytrafficreport/copywriting-tips/”

On the only wordpress blog I own I have the All in One SEO plugin which allows you to block archives.

April

Liz Tomey 06.13.07 at 9:29 am

Okay, here’s what’s not making sense though. There are so many definitions of duplicate content. Some think it’s just putting text on your site that is on other sites. Some think if you have the same menu bar, and footer information (like copyright notice and such) that Google sees that as duplicate content.

So, before you can fix the problem of Google indexing our posts from tag pages, category pages, and such, don’t you first have to figure out what Google considers duplicate content?

I have over 100 content sites. I have tested the heck out of this duplicate content stuff, and the only time I have seen my SE rankings drop is when I used a cookie cutter site…

So before we fix a problem, don’t we have to see what the problem is?

And… What could these new “dupe cures” do in terms of affecting our blogs in a negative way?

Liz

Jack Humphrey 06.13.07 at 10:00 am

Liz,

That’s how I’m feeling about the issue. That DupeCure thing, once you turn it on, automatically blocks category pages, etc. where some of the richest keyword density is.

I have pages like tag results pages that rank in the engines over individual posts in those results because of the density (what other reason would it be?)

So I turned it back off. I wanted to see what it did, and I don’t like any plugin that just takes over something this serious without letting me easily choose what it blocks from indexing.

You can change things, but you have to get into the code. This plugin looks more and more like a link bait “I have a plugin” kind of thing rather than something created from an authority position on what to do with this issue.

Liz Tomey 06.13.07 at 10:25 am

I agree, Jack!

I think the whole duplicate content thing is more of a scare tactic, and people trying to create a market, than anything.

Every time a new “dupe content” thing comes out, my readers and coaching clients blow up my phone, blog, and email about it. Everyone is so scared of it.

I’ve tested and tested and will always test the duplicate content theory, but this is what I have found…

If you have a site that isn’t a template site that everyone and their brother is using, and your content is about 75-80% unique, you’re good to go.

If dupe content was an issue, the blog memes, the link baiting and all the other stuff we do as bloggers would have us all down in the bottom 1 million of the search results.

Just my thoughts… :)

Liz

Elizabeth Adams 06.13.07 at 11:06 pm

Re: Duplicate Content and Supplemental Index

Hello, Jack …

Well, if it was me, I’d start here, with the idea that, if I was a search engine, then I wouldn’t want, let’s say, four instances of the same page in my database.

But, since I’m a search engine, I can’t figure out which one of the four I should keep, so I solve the problem by throwing them all out. Simple!

So, if you don’t want all four of your pages thrown out, then you just tell the search engine which one you want to keep. Simple!

Does this mean that the other 3 pages won’t still be on your blog?

No, it doesn’t. They’re still “there”. Real, regular people can still see them … and link to them … and you’ll still get credit for those links.

They’re just not in the search engines because you had the good sense to use <meta name="robots" content="noindex, follow"> in your page headers for the other three instances so the search engines will know that the content isn’t for them. Simple!
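As a rough illustration of that idea, here is a small Python sketch that picks the robots meta tag for a page based on what kind of page it is. The page type names and the default list of types to keep out of the index are assumptions for the example, not anything WordPress or a particular plugin defines. The list is a parameter so you can decide for yourself whether category or tag pages stay indexed.

```python
def robots_meta_tag(page_type, noindex_types=("search", "date_archive")):
    """Return the robots meta tag for a page.

    page_type     -- hypothetical label such as "post", "category", "tag",
                     "search", or "date_archive"
    noindex_types -- page types to keep out of the index while still
                     letting spiders follow the links on them
    """
    if page_type in noindex_types:
        directive = "noindex, follow"
    else:
        directive = "index, follow"
    return f'<meta name="robots" content="{directive}">'


print(robots_meta_tag("post"))
print(robots_meta_tag("search"))
# Opting category pages out as well is a choice, not a default here.
print(robots_meta_tag("category", noindex_types=("search", "category")))
```

A real theme or plugin would print this tag inside the page's head section. The sketch only shows the decision being made.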

Of course, if it’s after the fact … if you’ve been building a blog for years and have thousands of pages … then it’s a job of work to go in there and edit the code to conform to this concept.

But then again, if you’ve been blogging for years, you probably have lots of “link love”, and it’s my understanding, from reading a transcript of Google’s Matt Cutts, that lots of backlinks covers a multitude of sins and saves you from “Supplementals Hell”.

Still, if you’re just starting up, or have only been blogging for just a few months, it seems to me like it wouldn’t hurt to tell the search engines which links to index and which links to leave alone.

That way, the search engines are happy because there’s only one instance of any given page of yours in their results, and you’re happy because all of your pages are *both* listed *and* available to your visitors in several cross-referenced categories.

Warmest regards …

Elizabeth

P.S.

It’s my understanding that “quality of content” per se really isn’t the issue because, apart from obvious keyword stuffing or something black-hat like that, the spiders really can’t judge content quality. So the only thing the search engines have to fall back on in that regard is “link love”, on the theory that people wouldn’t link to you if they didn’t love you. This seems a rather fatuous assumption to me, but then I’m not a search engine. You just never know what a robot is going to find exciting!

:)

theDuck 06.16.07 at 3:51 am

