Has This Little Known Panda Theory Killed YOUR Site's Rankings?
Panda is largely about DUPLICATE CONTENT.
I don't think Panda looks for on-page issues, site speed or other factors. Why?
Because I have 6 near-identical WordPress sites, and two have been completely unaffected by Panda.
First of all, my theory isn't my own, it's this guy's:
Google's Panda Penalty: No Recovery After 1 Year
Read this carefully before you rip my post to shreds!
Executive summary:
This guy believes that Google trusts certain sites and knows that they will NEVER have duplicate content. Sites like the IRS, the NY Times, the BBC, etc.
So rest assured, if you ever copy content from these sites, you'll get a Panda penalty for dupe content.
I believe that successive iterations of Panda have allowed Google to "trust" more sites. Maybe there's a trust hierarchy. In fact they already have this - the page rank of a site's home page.
So I think what they're doing is flagging content as duplicate if it appears on a site with a higher rank than yours - and this is the important bit - even if you're the original content creator.
Yikes!
Google's Folly
Look, Google has a big problem. Until recently (2011?) they don't appear to have been recording where a piece of content first originated. Now they are.
The problem is that I have content dating from 2002 or earlier. And it appears Google doesn't have a clue as to where it first popped up, because they don't appear to have ever recorded that fact.
Matt Cutts is very aware of the problem. How do I know? Because in one of his videos he responded to a guy asking what would happen if Site A published a document and Site B copied it, then got indexed before Site A [so Site A's content is effectively first published by Site B].
Well, Matt didn't really have an answer for this, which I suppose is why they're trying rel=author, using sitemaps and rapidly crawling the most trusted sites. I can't find the video again, but I think he suggests using the web spam feedback at the bottom of the search page, or using a DMCA notice to take down the copy.
My Evidence of a Panda Duplicate Content Penalty
Panda is resource intensive, and so it only updates once a month (or less often).
This gives me clues that they're doing something big, like checking for duplicate content. Maybe they're checking for direct copies, as well as things that are similar (like spun/rewritten content). Maybe they also check for duplicate images, although I believe Panda only works on text.
I think that successive iterations of Panda haven't changed the algorithm much, but they've simply scaled it up to more sites (maybe they're trusting a lot more sites).
Anyway, here's my hard evidence that I've been smashed by duplicate content.
First my old sites (1999 and 2002 vintage). They've been hit hard, especially in the last Panda iteration.
Why?
Because my content is all over the web!
There are two reasons for this. Firstly, my software site used PAD files to distribute my software details, so I have pretty much identical content on loads of download sites. I also copied some of my own content from sites I used to write on. So I'm partly to blame for my own downfall.
Secondly, there's been a heck of a lot of content stealing, particularly from my 1999 blog. In fact I found some Indian programmer had actually copied one of my entire articles and posted it as his own work. And some travel blogger on blogspot had copied my entire article about visiting Japan. There are also many scraper sites that republish snippets of my sites and yours, maybe hoping that they'll rank for my keyword + your keyword.
Now this gives me clues that Panda ain't about quality. Because my stuff about Japan is unique and rare, simply because not many people have visited that country.
I've maybe also been sunk because the duplicate content is on sites with higher page rank than mine (blogspot, eHow clones, software download sites). As a result, I don't even rank for my own product name. Sheesh, Google have clearly screwed up here, because even DuckDuckGo can figure this one out!
Second, my new sites I started last year.
Some have been hit, some haven't. Why?
I think there are two reasons. I'll admit some of my content is junk. However, Google are a poor judge of quality because my site about red widgets got hit hard, but blue widgets escaped. Now I am a guru on red widgets (I have owned several), but I know a lot less about blue widgets (which I've never owned). Where I slipped up on the red widgets site is that I republished some of my banned HubPages on my own sites. Why waste content?
Big mistake!
This is a classic Panda spam signal - a small site copying content from a big site!
Finally, onto other people's sites.
Many other well respected sites have been hit by Panda. I believe that duplicate content is the issue. That Tim the Builder site did have thin content. But he also had a lot of content that was ripped off by eHow writers and legions of other sites. Maybe they didn't copy him word for word, but there are really only so many ways you can write how to fix a leaking tap. And once again, I believe that eHow would have a better trust rank than Tim's site.
Of all the content Warriors are likely to write, tutorials and how-tos are the most at risk of being copied.
Cookery sites have seen a lot of Panda smackdowns - again, a recipe is very easy to rip off; indeed, cookery book writers have been doing it to each other for hundreds of years!
Are YOU Affected?
Just search Google for the first paragraph of some of your prime copyable pages (tutorials, really popular content), wrapped in double quotes.
If you find identical content, you could have a problem.
If you find identical content outranking your own site, you've definitely got a problem!
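If you have a lot of pages to check, the quoted search above is easy to script. Here's a minimal Python sketch (the function name and the 15-word cutoff are my own choices, not anything Google documents) that turns a page's opening paragraph into a quoted Google search URL you can open in a browser:

```python
from urllib.parse import quote_plus

def quoted_search_url(paragraph, max_words=15):
    """Build a Google search URL wrapping the paragraph's first few
    words in double quotes. Very long quoted queries get truncated by
    the engine, so 15 words is a rough, arbitrary cutoff."""
    snippet = " ".join(paragraph.split()[:max_words])
    return "https://www.google.com/search?q=" + quote_plus('"' + snippet + '"')

# Paste in the opening paragraph of one of your "prime copyable" pages:
para = "There are really only so many ways you can write how to fix a leaking tap"
print(quoted_search_url(para))
```

Open the printed URL and see who else is showing your exact words.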
How to Fix Panda
OK, so if you've copied content from a larger site, then you're dead in the water. Remove the copied content and hope for the best!
Beyond this...
First, make sure you've registered your site in WEBMASTER TOOLS and have made a SITE MAP.
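For reference, a sitemap is just an XML file listing your URLs with last-modified dates, which gives Google a dated record of when your content existed. A minimal sketch (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/my-article/</loc>
    <lastmod>2012-04-01</lastmod>
  </url>
</urlset>
```

Save it as sitemap.xml in your site root and submit it in Webmaster Tools.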
Second, add rel=author tags to your content. I don't think Google uses this for duplicate content checking, but one day they will.
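At the time of writing, rel=author is usually wired up as a link from the article to your Google+ profile. A sketch with a placeholder profile URL:

```html
<!-- In the <head> of each article page; the profile URL is a placeholder -->
<link rel="author" href="https://plus.google.com/YOUR_PROFILE_ID"/>

<!-- Or inline in the byline -->
<a rel="author" href="https://plus.google.com/YOUR_PROFILE_ID">Your Name</a>
```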
Third, if your site is older than 2009 and you've never used a sitemap, then maybe search for the first paragraph of some of your content (or use Copyscape). Then take down the duplicates, either by DMCA requests or by emailing the hosting company listed in the WHOIS record (this is apparently more effective than contacting the site's webmaster). If this theory holds water, then your priority should be to go after duplicate content on high page rank sites; don't worry about five-page EMDs or crappy autoblogs.
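Digging the hosting company's abuse contact out of WHOIS can also be semi-automated. A rough Python sketch that shells out to the system whois command (assumed installed) and picks out abuse addresses; WHOIS record formats vary wildly between registrars, so treat the output as a starting point, not gospel:

```python
import subprocess

def extract_abuse_emails(whois_text):
    """Pull likely abuse-contact addresses out of raw WHOIS output.
    Crude heuristic: keep any @-token on a line mentioning 'abuse'."""
    emails = set()
    for line in whois_text.splitlines():
        if "abuse" in line.lower() and "@" in line:
            for token in line.replace(":", " ").split():
                if "@" in token:
                    emails.add(token.strip(".,;"))
    return sorted(emails)

def whois_abuse_contacts(domain):
    """Run the system `whois` tool (assumed installed) on a domain."""
    out = subprocess.run(["whois", domain], capture_output=True, text=True).stdout
    return extract_abuse_emails(out)
```

Call whois_abuse_contacts() on the copying site's domain, then send your DMCA notice to whatever it finds.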
Fourth, if your site is big and not even vaguely MFA, then file a reconsideration request, giving Google evidence that you're the original owner of the content and showing who has stolen it.
Fifth, write FEWER pages, and make them LONGER. This will allow you and Google to more easily check for duplicates.
Sixth, make content that CAN'T be copied so easily, like YouTube videos, Facebook social pages or forums. It's surely no coincidence that sites using these haven't been so badly hit by Panda.
Seventh, write NEW content to mitigate the penalty of having duplicate content.
Holes in My Theory
I've never been able to recover from Panda. But up till now I've really been addressing thin content, and not duplicate content.
I also don't know how long the content copying penalty lasts, and whether it still applies if the original document disappears (like my banned HubPages).
Now I'm not suggesting Panda only looks at duplicate content. However, I think it is by far the biggest factor, especially if you've only been doing white hat stuff with little link building, and you can honestly say your content is good.
Anyway, I'd be interested in hearing if anyone else thinks they have been sunk by their own content appearing on other sites. Hopefully this theory will give you something to focus on.
My Action Plan
Look, I'm screwed! I want my software site back in Google because I have a real product and it's exactly the site that should rank well. Yet my content is so widespread that it would take years to DMCA it all down.
So I'm going to scrap my software site, and rebuild it with 10% of the pages it once had. I'll also add Facebook content and some videos.
I think I can save my blog; what I'll do is delete the worst of the duplicate content and issue DMCA notices for the copied content.
I'll post updates if I make progress in getting my rank back, so watch this space!
If you were disappointed in your results today, lower your standards tomorrow.
* Get Results - Outsource Your PPC Management |
* Don Burk Advertising & Marketing - www.donburk.com