What is duplicate content? (And how to deal with it)

May 7, 2019

What is duplicate content?

Duplicate content is simply content that appears in more than one place on the internet. For example, if website A publishes a piece of content that was already available on website B, that’d be considered “duplicate content”.

How does Google react to duplicate content?

As you may already know, Google, like any other search engine, does not like duplicate content.

If Google concludes that a website is plagiarising and duplicating content to manipulate rankings, it may penalise that website and demote it in the search engine results pages.

The problem that Google faces because of duplicate content is that the search engine does not always know which web page to rank for related queries. It also has to decide whether to direct link metrics and authority to one page or keep them split between multiple versions.

On the other hand, for website owners, duplicate content has multiple negative effects:

Since search engines rarely show multiple versions of the same content in the SERPs, the visibility of each of those web pages gets diluted. If Google decides that you are duplicating content, your website may not be ranked in the SERPs at all. This will result in a loss of traffic and credibility.
If a website owner has multiple versions of the same web page on her site, link equity will be diluted. Other websites will have to choose which web page they should link to, and that can impact the visibility of a page in the SERPs.

3 reasons duplicate content issues may happen

Sometimes, duplicate content issues may happen even without your knowledge. Here are 3 common reasons why you may encounter this problem:

URL variations can often cause duplicate content issues. For example, if each user is being assigned a different session ID, you may encounter different URL variations of the same web page.
If your website has separate versions (for example, with and without the www prefix), the same web page may live on both versions, effectively creating multiple web pages.
Similar product descriptions may lead to very similar (duplicate) content on your site. For example, you may create multiple pages for similar products aimed at different target audiences and categories. E-commerce businesses often face this problem.
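
To see how URL variations collapse into one page, here is a minimal Python sketch that normalises URLs and groups the ones that point to the same content. The session parameter names and the choice of the non-www host as the preferred version are assumptions made for illustration, not rules from any search engine.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; replace with the ones your site uses.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def normalize_url(url: str) -> str:
    """Collapse common URL variations of one page into a single form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: the non-www host is the preferred version of the site.
    if host.startswith("www."):
        host = host[4:]
    # Drop session parameters and keep the rest in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    )
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return only the groups in which several URLs share one normalised form."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Running group_duplicates over a crawl export or a list of URLs from your server logs gives a quick list of candidates for the fixes described in the rest of this article.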

How to deal with duplicate content?

As you can guess by now, duplicate content isn’t good for search engines or for website owners. So what should you do about it?

If you have multiple versions of a web page on your website, here are a few things you can do to avoid the aforementioned problems:

1. Rel=”canonical”

The rel="canonical" link element tells search engines that a given web page is a copy of another URL and, therefore, should be treated as such. By using it, you inform search engines which version of the web page is the original. It also signals that all the links, SEO juice, and other ranking power should be directed to the main version of the web page, not the duplicated one.

The element looks like this:

<link href="URL OF ORIGINAL PAGE" rel="canonical" />

It should be added to the HTML head portion of each duplicate version of the web page.
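
If you want to confirm that a duplicate page really declares its canonical URL, a small check like the following can help. It is only a sketch that uses Python's standard html.parser module; the names CanonicalFinder and find_canonical are made up for this example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

A result of None for a page you know to be a duplicate means the tag is missing and the search engine is left to pick a version by itself.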

2. 301 redirects

Sometimes, the best way to fix a duplicate content issue is to set up a 301 redirect, which permanently sends search engines and users from the duplicate page to the original version that you want to keep.

Multiple versions of a web page often compete with each other for a spot in the search engine results page. However, when you set up a 301 redirect, you stop that competition and, in fact, combine their SEO power to support the main page. This can sometimes have a positive impact on the main page’s ability to rank higher in the SERPs.
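
In practice, redirects are usually configured in your web server or CMS, but the mechanics are easy to show in a few lines. The sketch below uses Python's built-in http.server module; the paths in REDIRECTS are hypothetical examples, and the handler is meant for local experiments rather than production use.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping: duplicate path -> the original page you want to keep.
REDIRECTS = {
    "/shoes-copy": "/shoes",
    "/index.html": "/",
}


def resolve(path):
    """Return (status, location): 301 plus target for duplicates, else 200."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location is not None:
            # The Location header tells browsers and crawlers where the page moved.
            self.send_header("Location", location)
            self.end_headers()
            return
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"original page\n")


# To try it locally, uncomment the next line and open http://127.0.0.1:8000/shoes-copy
# HTTPServer(("127.0.0.1", 8000), RedirectHandler).serve_forever()
```

Note that this simple lookup matches the path exactly, so a request with a query string attached would not be redirected unless you strip the query first.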

3. Noindex meta tag

You can also add a robots meta tag, <meta name="robots" content="noindex,follow">, to the HTML head of a duplicate version to exclude it from the search engine’s index.

By doing so, you are allowing Google to crawl the duplicate web page and follow its links, but you are not allowing the search engine to index the page and show it in the SERPs.

Note: Google explicitly cautions against restricting its crawling access. A crawler also has to fetch a page before it can see the noindex value, which is why you allow Google to crawl the web page while the noindex value stops the search engine from indexing it.
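
To check which robots directives a page actually carries, you could parse its HTML along these lines. This is a rough sketch built on Python's standard html.parser module, and the names RobotsMetaFinder and robots_directives are invented for the example.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, for example "noindex,follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str):
    """Return the set of robots directives declared in the HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives
```

A duplicate page set up as described above should give a set containing both noindex and follow.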

Conclusion

By using any of the three methods above, you can fix the duplicate content issues on your website. The right method will depend on what you want to achieve with your content.