You probably know that your website should always contain original content. Publishing duplicate content is a serious mistake that can hurt both your site’s ranking and your reputation. Plagiarism, or passing someone else’s work off as your own without permission, is unacceptable both online and offline.
It’s not that difficult to take an existing article from a website, switch a few words around, and palm it off as your own work. In fact, there are article-spinning tools designed to do just that (although not with perfect results – one that I tried while researching this article changed “eyes green with envy” to “eyes inexperienced with envy”).
If you publish content on your website that has already been published elsewhere (or has enough similarities to existing web content), you could find your site penalized by search engines. And that would defeat the point of writing that content in the first place.
How is Duplicate Content Defined?
Duplicate content is content that appears in more than one online location. Usually that means different websites, although it can also occur on separate pages of the same site. If you publish your own content in more than one place, you have duplicate content. If you copy someone else’s content onto your site or if they publish yours on their site, that’s duplicate content.
Google defines duplicate content in its guidelines as follows: “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”
There are two main ways you can end up with web content that is not unique:
- You publish content that has been copied from another website
- Someone else copies your content after you publish it
I’m not going to go into great detail on how to prevent the second scenario, although one of the tools listed below can help you detect it.
Using Google to check for Duplicate Content
One quick way to check if a page may be considered a duplicate is to copy around ten words from the start of a sentence and paste them, in quotes, into Google. This is actually Google’s recommended way to check.
If you test this for a page on your website, you would expect only your webpage to show up, ideally with no other results.
If other websites show up alongside yours, the result Google lists first is a hint at which page it considers the original source. If that isn’t your website, you may have a duplicate content issue.
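If you want to run this check on many sentences, you can build the search link instead of typing it by hand. Here is a minimal sketch in Python. The helper names `first_words` and `exact_match_search_url` are my own, made up for illustration, and the only thing the code does is build a URL that you then open in your browser.

```python
from urllib.parse import quote_plus


def first_words(sentence: str, count: int = 10) -> str:
    """Return the first `count` whitespace-separated words of a sentence."""
    return " ".join(sentence.split()[:count])


def exact_match_search_url(sentence: str, count: int = 10) -> str:
    """Build a Google search URL for an exact-phrase (quoted) query.

    Hypothetical helper for illustration: it only builds the link,
    it does not fetch or parse any results.
    """
    phrase = first_words(sentence, count)
    # Wrapping the phrase in double quotes asks for an exact match.
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


if __name__ == "__main__":
    sample = (
        "If you publish content on your website that has already "
        "been published elsewhere, you could find your site penalized."
    )
    print(exact_match_search_url(sample))
```

Open the printed link and check whether your page is the only result, or at least the first one.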
How much original information should appear on the page in order to be considered unique?
Even with no duplicate content “penalty,” it is widely accepted that Google rewards quality, uniqueness, and the signals associated with adding value. Meanwhile, a critical component of cost-effective SEO is creating pages that are seen as unique but that also leverage existing content. This raises a fair question: how much original information must a page contain to be considered unique? It seems that at least half of the text needs to be original. The ratio, rather than the total word count, appears to be the deciding factor: a page with 100 unique words and 100 duplicate words would be considered unique, whereas a page with 400 unique words and 800 duplicate words would be considered duplicate.
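The arithmetic behind that rule of thumb is easy to sketch. The snippet below is only an illustration of the 50/50 idea described above, not a reproduction of how any search engine actually scores pages. The function name and the `threshold` parameter are assumptions of mine.

```python
def is_unique_by_ratio(unique_words: int, duplicate_words: int,
                       threshold: float = 0.5) -> bool:
    """Apply the rough 50/50 rule: the unique share must reach the threshold.

    Hypothetical helper; the 0.5 default mirrors the rule of thumb in the
    text and is not an official figure.
    """
    total = unique_words + duplicate_words
    if total == 0:
        return False  # an empty page has nothing unique to offer
    return unique_words / total >= threshold


if __name__ == "__main__":
    print(is_unique_by_ratio(100, 100))  # 100 / 200 = 0.50 -> True
    print(is_unique_by_ratio(400, 800))  # 400 / 1200 = 0.33 -> False
```

Note that the second page has four times as many unique words as the first and still fails, because only the proportion matters here.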
What is acceptable use of other content?
One legitimate reason for hits to appear in a copy checker is content curation, where you add content to your site that you didn’t have to write yourself. This is perfectly acceptable, if done correctly, and an excellent way to build your search engine rankings. But, again, certain sections of your curated articles will show up in a copy checker.
Luckily, there’s no need to panic or fire your content writer. Search engines aren’t looking for the odd phrase or quote to come up. They’re more bothered about whole chunks of identical text and clear signs that an article has been copied from another website.
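To see the difference between an odd quoted phrase and whole chunks of identical text, you can compare two texts by their overlapping word sequences (often called shingles). This is a rough sketch of that general technique, which I am using for illustration; I am not claiming it is what search engines or the tools below do internally. The function names and the five-word window are my own choices.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of consecutive `size`-word sequences in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_share(candidate: str, source: str, size: int = 5) -> float:
    """Share of the candidate's shingles that also appear in the source.

    Hypothetical helper: 0.0 means no shared five-word runs,
    1.0 means every run in the candidate also occurs in the source.
    """
    candidate_shingles = shingles(candidate, size)
    if not candidate_shingles:
        return 0.0  # text shorter than one shingle
    shared = candidate_shingles & shingles(source, size)
    return len(shared) / len(candidate_shingles)


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog near the river"
    quote_only = "my own article mentions the quick brown fox just once"
    print(overlap_share(source, source))
    print(overlap_share(quote_only, source))
```

A short quote inside otherwise original writing shares few or no five-word runs with the source, while a copied article shares nearly all of them.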
Here are some good free tools that can be used to check for duplicate content:
Copyscape – This tool checks the content you have written against already-published content in a matter of seconds. The comparison tool highlights content that shows up as duplicate and tells you what percentage of your content matches already-published content.
Plagspotter – This tool can identify duplicate pages of content across the web. It’s a great tool for finding plagiarists who have stolen your content. It also allows you to automatically monitor your URLs on a weekly basis to identify duplicate content.
Duplichecker – This tool quickly checks the originality of the content you are planning to post on your site. Registered users can do up to 50 searches per day.
Plagiarism Detector – You can paste in up to a thousand words in Plagiarism Detector and check them without needing to pay, which is probably good enough for many people. Even if you have more writing than that to check, you can simply paste in the next section, and the next, until you’re done.
For professionals who need to do this often, the premium option will let you check up to twenty-five thousand words at once.
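If you are pasting a long article into a checker with a word limit, a few lines of Python can split it into sections for you. This is a minimal sketch, assuming the thousand-word limit mentioned above; the function name is hypothetical, and splitting on whitespace means the original paragraph breaks are not preserved.

```python
from typing import List


def split_into_chunks(text: str, max_words: int = 1000) -> List[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    article = "word " * 2500
    chunks = split_into_chunks(article)
    for number, chunk in enumerate(chunks, start=1):
        print(number, len(chunk.split()))
```

Each chunk can then be pasted into the checker in turn.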
Grammarly – As well as being an excellent grammar checker, Grammarly also offers a free plagiarism checker, along with a premium version that will suggest where, and whom, you need to credit if you’ve accidentally quoted someone else.
Siteliner – This is a great tool that can check your entire site once a month for duplicate content. It can also check for broken links and identify the pages that are most prominent to search engines.
Smallseotools – A variety of SEO tools are available, including a plagiarism checker that identifies fragments of identical content.