Google’s latest search engine ranking algorithm, code-named Panda, was updated last Monday, August 20th, 2012. Google claims this latest update will affect less than 1% of its search results. These updates are now happening every month; the previous Panda update was on July 24th, 30 days ago.

Panda is aimed at content scrapers: websites that copy (scrape) other websites’ content and use it as if it were their own. Google knows which website first published a piece of content, and whoever is seen reusing it will be penalized by this latest update. The first Panda update also targeted and penalized many sites whose content duplicated that of other sites.

Here is Google’s definition of duplicate content:
‘Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.’
The key phrase there is ‘substantive blocks’, which means their algorithm is looking for paragraphs or whole pages of content that are very similar.
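To get a feel for what ‘appreciably similar’ can mean in practice, here is a minimal sketch of one common way to compare two blocks of text: break each into overlapping word sequences (shingles) and measure how many they share (Jaccard similarity). This is not Google’s algorithm, which is not public. The function names, the shingle size and the two sample pages are all assumptions made up for illustration.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=5):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical page text where only the city name changes.
toronto_page = ("We offer reliable plumbing services in Toronto "
                "with same day service and free quotes")
ottawa_page = ("We offer reliable plumbing services in Ottawa "
               "with same day service and free quotes")

score = jaccard_similarity(toronto_page, ottawa_page, size=3)
print(f"Similarity: {score:.2f}")
```

A score close to 1.0 means the two texts are nearly identical, while unrelated texts score at or near 0.0. Where you draw the line for ‘too similar’ is a judgment call; the sketch does not try to guess the threshold a search engine would use.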

Google’s #1 aim for their search engine is to deliver the best results for the user’s search query, so they are rewarding fresh, unique, high-quality content by showing it higher in the search results.

Here are some typical problems that can lead to duplicate content issues and that people often overlook on their own websites:

  1. Ecommerce sites often use manufacturers’ product descriptions, which are then reproduced on multiple websites.
  2. Mirrored sites. I have seen lots of car sites that have signed up with a company to manage their content, but each one is using the same content! The same goes for websites that publish identical content on their .com and .ca domains.
  3. Printable pages that remove the header and navigation, but keep the same content.
  4. The same page can be reached through multiple URLs:
    http://www.example.com
    https://www.example.com
    http://www.example.com/index.htm
  5. Larger sites often suffer from content management systems that reuse the same title, meta description, headings, navigation and text across every page.
  6. City pages, where sites want to rank for different local areas and reuse the same content, changing only the city name.
  7. Pages that serve multiple data variables through URLs:
    DeptID=469
    &CatID=29841
    &CatTyp=DEP
    &ItemTyp=G
    &GrpTyp=SIZ
    This lets the user view the page in multiple ways, but it is a nightmare for search engines.
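Problems 4 and 7 both come down to one page living at several addresses. As a rough sketch, the Python below reduces the URL variants to a single preferred form using only the standard library. The choice of https, the list of default documents, the /shop path and the decision that ItemTyp and GrpTyp only change how the page is displayed are all assumptions for illustration; your own site’s rules will differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for illustration only: adjust both sets to match your own site.
DISPLAY_ONLY_PARAMS = {"itemtyp", "grptyp"}      # change the view, not the content
DEFAULT_DOCUMENTS = {"index.htm", "index.html"}  # same page as the bare folder


def canonical_url(url, scheme="https"):
    """Reduce the different addresses of one page to a single preferred URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"

    # Treat /index.htm and / as the same page.
    head, _, last = path.rpartition("/")
    if last.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"

    # Drop display-only parameters and sort the rest so order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in DISPLAY_ONLY_PARAMS
    )
    return urlunsplit((scheme, host, path, urlencode(query), ""))


for variant in (
    "http://www.example.com",
    "https://www.example.com",
    "http://www.example.com/index.htm",
):
    print(canonical_url(variant))
```

All three variants from problem 4 come out as the same address. The resulting URL is the kind of value you would typically put in a rel="canonical" link tag or use as the target of a 301 redirect, so that search engines see one page instead of several.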

How do you stay on the right side of Panda?

Our advice is to ask your thought leaders and writers to focus on developing high-quality content that is information-rich and useful to your readers. If you don’t have social sharing buttons on your website and blog, add them today. Your readers can then share your content socially, which will increase its authority and ranking in the search engines.

If you have any concerns regarding duplicate content, here are some tools to help you detect if your website is copying content from other sources:

Copyscape
A free service. Paste in your webpage URL to check whether it has been copied.

They have two paid services:
Premium – checks that your content is unique before you publish it. 5c per search.
Copysentry – monitors the web regularly for plagiarism. Starting at $4.95 a month.

Plagium
A free service that lets you paste up to 25,000 characters, then checks the web, news and social networks (Beta) for matches.

CopyGator
Also free, with some great tools. It will monitor your RSS feed to find out whether your content has been republished and notify you when a new post is duplicated, quoted or plagiarized. There is even a badge you can add to your website that turns red when your content has been duplicated.