Duplicate Content Penalty

Why Is This Myth Still Alive?

SEO CONSPIRACY S01E17

Is duplicate content a penalty or not?

It really depends on what you call a penalty, but no is the short answer. Google will tell you that there is no such thing as a duplicate content penalty, and they’re right: they don’t have any maths that says “this is a duplicate content penalty”, and that’s the problem.

Unless you’ve got the same, identical content on lots of different websites around the world, then maybe. But the more common case is that even on a standard WordPress site the content repeats across many different URLs, and Google is not going to penalize a standard WordPress installation. If you write ten articles, some of that content will appear on the home page, on the category pages and on the index page itself.

Even the individual post can be reached with ?p= and the post number, or with a slug in the URL, and it shows up again as a snippet elsewhere. There are so many ways content can be duplicated across URLs, or within a page, that of course Google doesn’t want to penalize it.
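
For instance, on a typical WordPress install the same post can often be reached through several addresses at once (the URLs below are hypothetical, just to illustrate):

https://example.com/?p=123
https://example.com/my-article/
https://example.com/category/news/ (as an excerpt)
https://example.com/ (as an excerpt on the home page)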

What it should lead the SEO to do, though, is try to minimize that, because if you have two different versions of the content, and people split between those two versions when they consume it, then you have cannibalized your content, and that’s where it goes wrong.

You don’t get into trouble for having two URLs with the same content on them; what you get is half the value and power going to each of those pages.

I don’t think there’s a duplicate content penalty. I think that laziness, and not trying to fix it, means you’re not going to maximize your SEO chances.

Duplicate content is a flaw

I proved it back in 2006 with my team of SEO hackers, the dark SEO team. We ran a demonstration against, of course, the usual target: Matt Cutts’ blog.

I forget the exact keyword, it was something like bacon-polenta, but we deranked Matt’s blog.

My friend Paul Sanchez ran the same demonstration on Matt’s homepage and knocked it out of the results. The page didn’t come back until Paul stopped his experiment.

The fact is, it’s a flaw that existed back in the day and it’s still not fixed today, and that’s a problem. If I remember correctly this was back in the Matt Cutts days, and obviously I think we were listening to Google’s voice more closely when Matt Cutts was there than we do today.

I don’t know if that number is still right, but I remember Dixon Jones saying that 30% of the web was duplicated. Mathematically, though, you can just create a little PHP file that changes the URL every time the page loads, which means you’ve got an infinite number of pages with the same content. Divide any number of pages on the internet by infinity and, on average, every page on the internet is a duplicate.
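
To picture that flaw, here is a rough, hypothetical sketch of the kind of little PHP file being described: every request serves exactly the same body text but links to a brand-new, randomized URL of itself, so a crawler following the links sees an endless stream of “pages” carrying identical content (the file name and markup are made up for illustration):

<?php
// Hypothetical infinite.php: same content on every load, new URL on every load.
$next = '/infinite.php?v=' . bin2hex(random_bytes(8)); // fresh random token per request
echo '<h1>The same article, on yet another URL</h1>';
echo '<p>Identical body text, repeated for every variation.</p>';
echo '<a href="' . htmlspecialchars($next) . '">next page</a>'; // the crawl never ends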

Quoting a percentage of the internet is a very dangerous thing to do, because there are an infinite number of URL variations on the internet.

Near Duplicate Content

For example, let’s take the press world, the official press.

They’re going to release a little piece of news, and that snippet gets duplicated thousands of times. It also gets rewritten.

I think there’s a problem with duplicate content and there’s also a problem with near duplicate content, whether it’s the same content repeated on the WordPress tag pages or someone rewriting the content.

So, a Google penalty for duplicate content would just not work.

I think news syndication is an interesting example of duplicate content, because fundamentally it still supports my point as well: the official press have syndicated the content out to multiple different places, so they’ve already monetized it.

But the issue with that is, if they sell the news out to lots of people, all the snippets of the news, then they don’t remain the authority anymore.

They’ve given away the authority.

And mostly, I think that has helped to create a bit of duplicate content that’s all over the place, but all of those bits of content lead back to Rome, or should lead back to Rome.

The main content, the main story, comes from one writer, so I still think the challenge, whether it’s ecommerce duplication, WordPress duplication or news syndication duplication, is to try and control that journey so there is still one point of truth pointing back to you, so that Google knows where the point of truth is.

I come from the black hat world, and I’m not in it anymore because it’s not worth it anymore.

I was a lazy black hat; I always wanted to automate content because it’s easy.

I tried everything, and the first automated content that worked was with a tool called Yahoo Pipes: you could mix RSS feeds together and the result would become “unique” content. It worked very well at first; the threshold was around 30%.

If you had less than 30% duplicate content, Google was eating it up very well.
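
For anyone who never saw Yahoo Pipes (it was shut down long ago), here is a very rough sketch in plain PHP, not Pipes, of the general idea being described: pull items from several feeds and blend them into one output. The feed URLs are placeholders, and the 30% threshold mentioned above is a recollection, not something this snippet measures:

<?php
// Rough illustration of the "mix RSS feeds" idea (placeholder feed URLs).
$feeds = ['https://example.com/feed-a.xml', 'https://example.org/feed-b.xml'];
$items = [];
foreach ($feeds as $url) {
    $rss = @simplexml_load_file($url);      // parse each feed, skip it silently on failure
    if ($rss === false) { continue; }
    foreach ($rss->channel->item as $item) {
        $items[] = (string) $item->title . ' - ' . (string) $item->description;
    }
}
shuffle($items);                            // blend the sources together
echo implode("\n\n", array_slice($items, 0, 10)); // output a mixed "article"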

Again, it depends on what you’re going to target: if it’s a low-competition niche, then you could pray to Saint Crap.

But if you want to challenge the big boys with automated content, good luck.

To finish with duplicate content: you have internal duplicate content within your site, and you have external duplicate content.

So you should set a canonical, you should canonicalize your content, so Google knows which one is the authoritative version of any particular piece of content.

That’s the least you could do.
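
As a minimal illustration (the URL below is just an example, and WordPress or an SEO plugin will typically output this for you on single posts), the canonical is one link tag in the page’s head, which a PHP template could emit like this:

<?php
// Minimal sketch: declare the authoritative URL for this piece of content.
$canonical = 'https://example.com/my-article/'; // example URL, not a real site
echo '<link rel="canonical" href="' . htmlspecialchars($canonical) . '" />';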

When you’re affected by duplicate content, it’s because your website is not strong enough and Google has trouble picking out the right source.

Listen to the podcast

Watch the video
