The single most important thing you can do to make your website a success is to ensure that your content is unique. Generally, this isn’t a problem if you write your own stuff. Even if you happen to repeat a phrase that another website used, your content won’t be picked up as duplicate content, because Google has a higher threshold than that. However, if you buy content for your website, then you generally do want to check it before you publish it. Here’s what you need to know:
The Duplicate Content Penalty Explained
Most of us know that unique content is important because of the Google duplicate content penalty. In practice, this means that Google tends to filter duplicated pages out of its search results, so copied content may never be shown to searchers and ends up being close to worthless to you.
However, what many people are not aware of is that Google doesn’t consider individual phrases to be duplicate content. In fact, one effective (though tedious) method of article spinning involves swapping whole paragraphs in and out to create content that Google’s search engine treats as unique.
This means that a duplicated phrase, or even a duplicated paragraph or two, won’t necessarily stop Google’s web crawler from indexing your site. However, there is another reason to make sure that your content is unique.
Copying whole paragraphs from someone else puts you in danger of lawsuits from the websites whose copyrights you infringe. It’s also very unprofessional, so you really do need to ensure that everything on your site is 100% unique to you. To check, you may want to try one of these three options:
The simplest and cheapest way to check for duplicate content is to use Google itself. Simply take a handful of random sentences from the content and plug them into Google. Do this with a sentence from the beginning, middle and end of your article. The reason I like this is that Google’s system is more sophisticated than something like Copyscape – it will find even sentences which are similar but not quite the same, something Copyscape and other services won’t necessarily find.
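If you have a lot of purchased articles to check, picking the sentences by hand gets tedious. The short Python sketch below is one way to automate that step: it pulls a sentence from the beginning, middle and end of an article and builds a Google search link for each one. The function names and the naive sentence splitter are my own assumptions for illustration, and the optional exact flag simply wraps the sentence in quotation marks for an exact-phrase search. You still open the links and judge the results yourself.

```python
import re
from urllib.parse import quote_plus


def split_sentences(text):
    """Naive splitter (an assumption): breaks after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sample_sentences(text):
    """Return one sentence each from the beginning, middle and end of the text."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    picks = [sentences[0], sentences[len(sentences) // 2], sentences[-1]]
    # Very short texts can yield the same sentence twice; drop repeats, keep order.
    return list(dict.fromkeys(picks))


def search_urls(text, exact=False):
    """Build one Google search URL per sampled sentence."""
    urls = []
    for sentence in sample_sentences(text):
        # Quotation marks ask Google for the exact phrase instead of similar text.
        query = f'"{sentence}"' if exact else sentence
        urls.append("https://www.google.com/search?q=" + quote_plus(query))
    return urls


if __name__ == "__main__":
    article = "First one. Second one. Third one. Fourth one. Fifth one."
    for url in search_urls(article):
        print(url)
```

Leaving exact off matches the approach described above, where Google is allowed to surface sentences that are similar but not identical. Turning it on is useful when you suspect a passage was copied word for word.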
By far the best-known way to check for unique content is Copyscape. This website lets you check each of your pages for duplicate content for free. You can also integrate the service into your own system and check everything automatically.
However, there are limitations. Copyscape, for example, will flag even a single duplicated phrase. This drove me crazy on a project for another client of mine, when I had to make a list of lottery games offered by various lottery commissions. I had simply copied the list from their sites and got hit with a duplicate content report on Copyscape.
Finally, Virante is a site which goes a step further than Copyscape. Theoretically at least, it will scan your entire site for duplicate content, making sure that everything is unique rather than simply scanning a single page at a time. The catch is that while it will tell you that there is a problem, it won’t tell you what the problem is or how to fix it (I think you’re expected to fill out their web form to get a call back with a price quote to help you fix the issues).