What is Duplicate Content?
"Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar." Source: Official Google Webmaster Central Blog
Search engines are in the business of providing relevant, unique content in their results. To do this, they apply algorithms that identify sites duplicating content, either within a domain or across domains. Essentially, when a search engine robot crawls a website, it reads the pages and stores the information in its database. It then compares what it found against the information already stored, determines where duplicate content occurs, and filters the duplicate pages out of its results.
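The crawl, store, compare, and filter steps can be illustrated with a small sketch. This is not how any particular search engine works; it assumes a simple word-shingle comparison with Jaccard similarity, and the function names, the three-word shingle size, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap divided by size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_duplicates(pages, threshold=0.8):
    """Compare each page with the pages already stored.

    pages maps a URL to its text. A page that is at least `threshold`
    similar to a stored page is reported as (kept_url, filtered_url, score)
    and is not stored itself. The threshold is an illustrative assumption.
    """
    index = {}      # url -> shingle set for pages kept so far
    filtered = []
    for url, text in pages.items():
        page_shingles = shingles(text)
        match = None
        for kept_url, kept_shingles in index.items():
            score = jaccard(page_shingles, kept_shingles)
            if score >= threshold:
                match = (kept_url, url, score)
                break
        if match:
            filtered.append(match)
        else:
            index[url] = page_shingles
    return filtered
```

Calling find_duplicates with a dictionary of page texts returns the pages that would be filtered, along with the earlier page each one matched. Real systems work at a far larger scale and use more robust techniques, so treat this only as a way to picture the comparison step.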
Most of the time duplication is unintentional and non-malicious; something as simple as having both an HTML version and a printer-friendly version of a page can trip the filter. In cases like this you can avoid the filter by blocking search engine spiders from indexing one version or the other.
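One common way to keep spiders away from one version of a page is a robots.txt rule. The sketch below assumes the printer-friendly pages live under a hypothetical /print/ directory on example.com, and uses Python's standard urllib.robotparser module to confirm that the rule blocks those pages while leaving the HTML versions crawlable. Strictly speaking, robots.txt controls crawling, so it is only one of several possible approaches.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers out of the /print/ directory,
# which is assumed here to hold the printer-friendly copies.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/article.html",
        "https://example.com/print/article.html",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Checking the rules this way before publishing them helps catch a pattern that accidentally blocks the version you want to keep in the index.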
Search engine spam, on the other hand, is any deliberate attempt to manipulate search engine rankings. It often leads to inappropriate, poor-quality search results and replicated content. It is this replicated content that has made duplicate content filters necessary.
What should you do to avoid duplicating content?
Quite simply, avoid duplicating content across your website, and make each page as unique as possible. If you own multiple websites with a similar theme, ensure that each site's content is distinctly different from that of the others.
Numerous online tools let you check how similar two pages are, and others let you find websites that may have "borrowed" or scraped your content. Offending sites can be reported to Google, which allows you to claim ownership of the content.
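For a quick check of your own pages, a similarity report can be put together with Python's standard difflib module. This is a rough sketch, not a substitute for the online tools mentioned above: the function name, the word-level comparison, and the 0.9 default threshold are assumptions made for illustration, and the page texts are expected to be plain text you have already extracted.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_report(pages, threshold=0.9):
    """Return (url_a, url_b, ratio) for every pair of pages whose
    word-level similarity ratio is at or above threshold.

    pages maps a URL to its plain text. The ratio runs from 0.0
    (nothing in common) to 1.0 (identical word sequences).
    """
    report = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        matcher = SequenceMatcher(None, text_a.split(), text_b.split())
        ratio = matcher.ratio()
        if ratio >= threshold:
            report.append((url_a, url_b, ratio))
    return report
```

Pairs that appear in the report are candidates for rewriting or merging. Comparing every pair of pages gets slow on large sites, so this approach suits a spot check of a handful of pages.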