We know that Google and other search engines want to show searchers reputable, relevant, and up-to-date content that meets their needs. You’ve no doubt heard about the importance of unique, original content, too.
And yet duplicate content occurs in a number of different ways, not all of them malicious or even negative in nature. It’s a reality for businesses of all sizes, and it is especially common for enterprise brands and those with multiple locations.
What is the true impact of duplicate content on SEO and does it negatively impact your search ranking?
In this column, we’ll take a look at what constitutes duplicate content, what Google says about it, and what you need to know about it to avoid any negative repercussions that could impact your brand’s search rankings and visibility.
What is Duplicate Content?
Duplicate content is any passage of text that is identical, or nearly identical, to text appearing elsewhere online.
When we talk about duplicate content in terms of SEO, we are not speaking of plagiarism, scraping, or other content that is copied for malicious or illegal reasons. Duplicate content occurs naturally in a number of different ways; for example, when you include a quote from another online source, or when a manufacturer’s product description appears on various reseller websites.
What Actually Constitutes Duplicate Content?
In its documentation for developers, Google notes that duplicate content “generally refers to substantive blocks of content within or across domains that either completely match other content in the same language or are appreciably similar” and lists several types of duplicate content:
- Discussion forums that can generate regular desktop and stripped-down mobile pages
- Items in an online store that are shown or linked to by multiple distinct URLs
- Printer-only versions of web pages
Enterprise brands often have duplicate content on local landing pages, Google Business Profiles, and other types of local listings. This could include product or service descriptions, the company’s mission statement and vision, taglines, promotions, and more. Duplicate content can occur on a single domain or on external websites.
You may even create duplicate content intentionally through the process of content syndication. This is a media tactic that’s been used for generations to expose new audiences to content. For example, many newspapers carry columns from AP, and this does not make the content any less reputable. In fact, syndication speaks directly to the quality and authority of the content; it wouldn’t be featured in newspapers and on news websites across the country unless there was value in doing so.
As a multi-location brand, you might use syndication to distribute blog posts to a wider audience across all or some of your locations’ blogs. You might republish blog posts on additional platforms such as Medium or LinkedIn to create more awareness and appeal to a new audience. Content syndication creates duplicate content, and yet it’s a widely accepted practice. Google just needs to know which version is the original.
Is Duplicate Content Bad for Your SEO Strategy?
There has been some confusion over the years about the impact of duplicate content on SEO, with some stating that Google will penalize a site for having it. This is not true – there is no Google penalty for duplicate content.
Google will not take manual action to suppress your pages or remove them from the index simply because duplicate content is detected. Action is reserved for cases where the duplication appears intended to manipulate rankings.
Google explains, “In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.”
However, although there is no duplicate content penalty, duplicate content will not help your rankings, either. What’s more, if Google cannot tell which is the original article or page, one of the duplicates could outrank the original. This can have real consequences if a competitor’s page, or one of your own local pages, outranks the page you want to drive traffic to for a valuable keyword.
What matters most here is that we have a way to let Google know when content is being republished and which page contains the original, whether it’s on the same website or external.
How To Help Google Recognize Duplicate Content
It’s important that Google can recognize which of the duplicate versions is the original. There are bad actors who scrape websites and steal content to reprint in an effort to manipulate search rankings – and you do not want Google wondering if that may be the case with your content. Search engines strive to show the most valuable original answer in response to each query.
The process by which Google determines which URL contains the original content is called canonicalization.
What is Canonicalization?
Google explains how it uses canonicalization: “Google will choose one URL as the canonical version and crawl that, and all other URLs will be considered duplicate URLs and crawled less often. If you don’t explicitly tell Google which URL is canonical, Google will make the choice for you, or might consider them both of equal weight, which might lead to unwanted behavior…”
Among the reasons you would want to use canonicalization to make it clear which is the original, Google lists:
- To specify which URL you want people to see in search results.
- To consolidate link signals for similar or duplicate pages.
- To simplify tracking metrics for a single product or topic.
- To manage syndicated content.
- To avoid spending crawling time on duplicate pages.
This Google video explains how and why Googlebot chooses the canonical URL for duplicate content.
Putting Canonical Links To Work
If you aren’t sure whether Google already considers one page the canonical version, use its URL Inspection Tool to see. It can also be worth checking if you notice a specific page underperforming, as Google may ignore your canonicalization instruction and choose a different page.
There are several different ways to indicate which page is the canonical version:
1. The rel=canonical link tag
Adding a <link> tag with rel=canonical to the HTML of every duplicate page, pointing to the canonical page, lets you map any number of pages back to the original. However, this only works for HTML pages, not PDFs, and can be difficult to maintain across large websites or sites with dynamic URLs that change often.
2. The rel=canonical HTTP header
Using the HTTP header method helps you avoid increasing the page size, as you will with the rel=canonical tag, and also lets you map to an infinite number of pages. This method can also be difficult to keep track of on sites with thousands or millions of pages and where URLs change often.
3. Using the sitemap
Specifying canonical pages on the sitemap is much easier to track and maintain on large, complex websites. With this method, though, Google sees which is the canonical but must still determine which pages are duplicates of it. Google itself says this is a “less powerful signal to Googlebot than the rel=canonical mapping technique.”
4. 301 redirects
This is a tactic to use only when you are deprecating a duplicate page. Using a 301 redirect tells Googlebot that the URL being redirected to is the preferred version.
One further tactic is unique to AMP pages and comes with its own guidelines for implementation.
Whichever method you choose, point to the same page consistently to avoid sending mixed signals to Google about which version is your preferred one.
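To make the first three methods more concrete, here is a minimal Python sketch of what each signal looks like when generated for a page. The function names and the example.com URLs are hypothetical and chosen only for illustration; how you actually inject the tag, header, or sitemap depends on your CMS and web server.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def canonical_link_tag(canonical_url: str) -> str:
    """Method 1: the <link> element for the <head> of each duplicate page."""
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


def canonical_link_header(canonical_url: str) -> str:
    """Method 2: the equivalent HTTP response header (also usable for PDFs)."""
    return f'Link: <{canonical_url}>; rel="canonical"'


def sitemap_xml(canonical_urls: list[str]) -> str:
    """Method 3: a sitemap that lists only the canonical URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        url_node = ET.SubElement(urlset, "url")
        ET.SubElement(url_node, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    preferred = "https://example.com/products/blue-widget"  # hypothetical URL
    print(canonical_link_tag(preferred))
    print(canonical_link_header(preferred))
    print(sitemap_xml([preferred]))
```

Generating all three signals from the same preferred URL is one simple way to keep them consistent with each other.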
Important Canonicalization Don’ts To Keep In Mind
Google provides a set of general guidelines to help marketers avoid common mistakes with canonicalization. Here’s a quick summary:
- Don’t use the robots.txt file for canonicalization purposes.
- Don’t use the URL removal tool for canonicalization, either, or you’ll remove all versions of the URL from Search.
- Don’t designate different URLs as the canonical for the same page.
- Don’t use noindex to try to prevent Google from making the wrong page the canonical.
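Two of these mistakes can be spotted by inspecting a page’s HTML. The sketch below, which uses only Python’s standard library, flags a page that declares more than one distinct canonical URL or that combines a canonical link with a noindex robots directive. The class and function names are hypothetical, and the noindex check is a rough heuristic of our own rather than a rule taken from Google’s guidelines.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Collects canonical hrefs and robots noindex directives from one page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (for example, a bare attribute).
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", ""))
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True


def audit_canonical(html: str) -> list[str]:
    """Returns human-readable warnings; an empty list means nothing was flagged."""
    parser = CanonicalAudit()
    parser.feed(html)
    distinct = sorted(set(parser.canonicals))
    warnings = []
    if len(distinct) > 1:
        warnings.append("conflicting canonical URLs: " + ", ".join(distinct))
    if parser.noindex and parser.canonicals:
        warnings.append("page declares a canonical and is also marked noindex")
    return warnings


if __name__ == "__main__":
    sample = (
        '<head><link rel="canonical" href="https://example.com/a">'
        '<link rel="canonical" href="https://example.com/b"></head>'
    )
    for warning in audit_canonical(sample):
        print(warning)
```

A check like this only reads what the page declares. To see which URL Google actually selected, you still need the URL Inspection Tool.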
A Few More Key Canonicalization Tips
Google prefers the HTTPS version of a page over the HTTP for the canonical, in most cases.
If you have trouble with your canonicalization strategy, you can refer to Google’s Troubleshooting guide for assistance.
Multi-Location Brand Alternatives To Duplicate Content
Google offers a number of technical workarounds and reminders in its Developer documentation to help address duplicate content issues. In addition to better managing your technical SEO, brands can work at increasing the volume of original content over time, as well.
Use a Duplicate Content Checker to Identify Opportunities
Tools like Grammarly, Copyscape, and Siteliner can help scan for and spot duplicate content across multiple domains. You may spot opportunities here to add variety to product descriptions, local pages, and more. If others are duplicating your brand’s content without permission and fail to remove it when requested, you can file a DMCA takedown request.
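If you want a rough feel for how similar two pieces of copy are before reaching for a dedicated tool, a common textbook approach is to compare overlapping word sequences (“shingles”) with Jaccard similarity. The Python sketch below illustrates that idea only. It is not how the tools named above work internally, and the function names and the three-word shingle size are assumptions made for this example.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Splits text into lowercase words and returns every run of `size` words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 (disjoint) to 1.0 (same)."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Hypothetical boilerplate descriptions for two locations.
    austin = "Our friendly team offers fast, reliable repairs in Austin."
    dallas = "Our friendly team offers fast, reliable repairs in Dallas."
    print(similarity(austin, dallas))
```

The two sample descriptions differ only in the city name and score 0.75, which is the kind of near-duplicate boilerplate worth rewriting for each location.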
Replace Boilerplate Text On Local Pages & Listings
While a boilerplate business description can get you up and running quickly, hyperlocal content will help each of your locations rank better over time. Google Business Profiles and local pages are both great opportunities to improve search rankings and better convert searchers with rich, unique information about that specific location.
Using a content management system designed specifically for enterprise brands, you can empower local managers to create this rich local content while still retaining brand controls. User permissions and a built-in workflow for original descriptions and other new content ensure your brand-level marketers can assist locations with editorial controls.
Incorporate Location-Specific UGC on Local Pages
Testimonials, customer reviews, and customer photos with descriptive alt text are just a few examples of user-generated content that can be used as original content on your local landing pages. Using the website field in each location’s Google Business Profile, direct users to a local page with rich, compelling information that converts.
Optimizing these pages with UGC can help them achieve a higher search ranking on their own, as well. Working on diversifying each location’s content over time can help combat common duplication issues and improve search engine rankings even in your most competitive markets. That it’s great for user experience, too, is the icing on the cake.
Need help creating custom content for each of your local landing pages to ensure you have the best chance of improving your rankings in search results? Contact Rio SEO today to ensure you’re providing your potential customers with the best search experience possible.