Handling User-Generated & Manufacturer-Required Duplicate Content Across Large Numbers of URLs

Posted by randfish

We know that Google tends to penalize duplicate content, especially when it’s something that’s found in exactly the same form on thousands of URLs across the web. So how, then, do we deal with things like product descriptions, when the manufacturers require us to display things in exactly the same way as other companies?

In today’s Whiteboard Friday, Rand offers three ways for marketers to include that content while minimizing the risk of a penalty.

Manufacturer-Required Duplicate Content Across Large Numbers of URLs – Whiteboard Friday

For reference, here’s a still of this week’s whiteboard!

Video Transcription

Howdy Moz fans, and welcome to another edition of Whiteboard Friday. Today I’m going to be chatting a little bit about a very specific problem that a lot of e-commerce shops, travel websites, and places that host user-generated and user-review types of content experience with regard to duplicate content.

So what happens, basically, is you get a page like this. I’m at BMO’s Travel Gadgets. It’s a great website where I can pick up all sorts of travel supplies and gear. The BMO camera 9000 is an interesting one because the camera’s manufacturer requires that all websites which display the camera contain a lot of the same information. They want the manufacturer’s description. They have specific photographs that they’d like you to use of the product. They might even have user reviews that come with those.

Because of this, a lot of the folks, a lot of the e-commerce sites who post this content find that they’re getting trapped in duplicate content filters. Google is not identifying their content as being particularly unique. So they’re sort of getting relegated to the back of the index, not ranking particularly well. They may even experience problems like Google Panda, which identifies a lot of this content and says, “Gosh, we’ve seen this all over the web and thousands of their pages, because they have thousands of products, are all exactly the same as thousands of other websites’ other products.”

So the challenge becomes: How do they stay unique? How do they stand out from this crowd, and how can they deal with these duplicate content issues?

Of course, this doesn’t just apply to a travel gadget shop. It applies broadly to the e-commerce category, but also to categories where content licensing happens a lot. So you could imagine that user reviews of, for example, things like rental properties or hotels or car rentals or flights or all sorts of things related to many, many different kinds of verticals could have this same type of issue.

But there are some ways around it. It’s not a huge list of options, but there are some. Number one, you can essentially say, “Hey, I’m going to create so much unique content, all of this stuff that I’ve marked here in green. I’m going to do some test results with the camera, different photographs. I’m going to do a comparison between this one and other ones. I’m going to do some specs that maybe aren’t included by the manufacturer. I’ll have my own BMO’s editorial review and maybe some reviews that come from BMO customers in particular.” That could work great in order to differentiate that page.

Some of the time you don’t need that much unique content in order to be considered valuable and unique enough to get out of a Panda problem or a duplicate content issue. However, do be careful not to go way overboard with this. I’ve seen a lot of SEOs do this where they essentially say, “Okay, you know what? We’re just going to hire some relatively low quality, cheap writers.” Maybe English, or whatever the language of the country you’re trying to target is, isn’t even their first language, and they write a lot of content that just all sits below the fold here. It’s really junky. It’s not useful to anyone. The only reason they’re doing it is to try and get around a duplicate content filter. I definitely don’t recommend this. Panda is built even more to handle that type of problem than this one, from Google’s perspective anyway.

Number two, if you have some unique content, but you have a significant amount of content that you know is duplicate and you feel is still useful to the user, you want to put it on that page, you can use iframes to keep it kind of out of the engine’s index, or at least not associated with this particular URL. If I’ve got this page here and I say, “Gosh, you know, I do want to put these user reviews, but they’re the same as a bunch of other places on the web, or maybe they’re duplicates of stuff that happened on other pages of my site.” I’m going to take this, and I’m going to build a little iframe, put it around here, embed the iframe on the page, but that doesn’t mean that this content is perceived to be a part of this URL. It’s coming from its own separate URL, maybe over here, and that can also work.
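To make the iframe option concrete, here is a minimal sketch in Python of a page template that keeps the shop's own content inline and pulls the duplicate block in by reference. The function name, the product copy, and the `/embeds/...` URL are hypothetical, chosen only for illustration; how any search engine actually treats the framed content is not something this sketch can guarantee.

```python
from html import escape


def render_product_page(title: str, unique_html: str, duplicate_block_url: str) -> str:
    """Return page HTML where the duplicate block is embedded by reference only.

    The duplicate text (for example manufacturer-supplied reviews) lives at its
    own URL and never appears in this page's markup.
    """
    return (
        "<!doctype html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<main>{unique_html}</main>\n"
        f'<iframe src="{escape(duplicate_block_url, quote=True)}" '
        'title="Manufacturer reviews" loading="lazy"></iframe>\n'
        "</body></html>\n"
    )


# Hypothetical product page: unique editorial content inline, shared reviews framed.
page = render_product_page(
    "BMO Camera 9000",
    "<h1>BMO Camera 9000</h1><p>Our own test results and editorial review.</p>",
    "/embeds/bmo-camera-9000/manufacturer-reviews",
)
print(page)
```

The framed URL is served as a separate document, so the product page itself contains only the unique material plus a pointer.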

Number three, you can take content which is largely duplicative and apply aggregation, visualization, or modifications to that duplicate content in order to build something unique and valuable and new that can rank well. My favorite example of this is what a lot of movie review sites, or review sites of all kinds, like Metacritic and Rotten Tomatoes do, where they’re essentially aggregating up review data, and all of the snippets, all of the quotes are coming from all of these different places on the web. So it’s essentially a bunch of different duplicates, but because they’re the aggregator of all of these unique, useful pieces of content and because they provide their own things like a metascore or a Rotten Tomatoes rating, or an editorial review of their own, it becomes something more. The combination of these duplicative pieces of content becomes more than the sum of its parts, and Google recognizes that and wants to keep it in their index.
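As a rough illustration of the aggregation idea, the sketch below normalizes review scores from different scales and combines them into a single 0-100 number that sits alongside the quoted snippets. The `Review` type, the sample data, and the plain unweighted mean are all assumptions made for this example; it is not the formula used by Metacritic, Rotten Tomatoes, or any other real site.

```python
from dataclasses import dataclass


@dataclass
class Review:
    source: str       # where the quote was published (hypothetical names below)
    score: float      # score as published by the source
    scale_max: float  # top of that source's scale
    quote: str        # the duplicated snippet shown on the page


def aggregate_score(reviews: list[Review]) -> int:
    """Normalize each score to 0-100 and return the rounded unweighted mean."""
    if not reviews:
        raise ValueError("need at least one review")
    normalized = [100 * r.score / r.scale_max for r in reviews]
    return round(sum(normalized) / len(normalized))


reviews = [
    Review("Example Gadget Blog", 4, 5, "Sharp lens, short battery life."),
    Review("Example Travel Mag", 7, 10, "Good value for casual trips."),
    Review("Example Photo Weekly", 90, 100, "Best compact in its class."),
]
print(aggregate_score(reviews))
```

The snippets themselves remain duplicates; the derived score is the piece that exists nowhere else.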

These are all options. Then the last recommendation that I have is when you’re going through this process, especially if you have a large amount of content that you’re already launching with, start with those pages that matter the most. So you could go down a list of the most popular items in your database, the things that you know people are searching for the most, the things that you know you have sold the most of, or the pages that internal searches have led to the most; great, start with those pages. Try and take care of them from a uniqueness and value standpoint, and you can even, if you want, especially if you’re launching with a large amount of new content all at once, you can take these duplicative pages and keep them out of the index until you’ve gone through that modification process. Now you sort of go, “All right, this week we got these 10 pages done. Boom, let’s make them indexable. Then next week we’re going to do 20, and then the week after that we’ll get faster. We’ll do 50, 100, and soon we’ll have our entire 10,000 product page catalog finished and completed, all with unique, useful, valuable information that will get us into Google’s index and stop us from being considered duplicate content.”
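One way to picture that staged rollout is the Python sketch below: rank pages by demand, cut them into weekly batches, and emit a robots `noindex` meta tag for every page that has not been released yet. The `demand` field, the sample catalog, and the batch sizes are made up for illustration; in practice demand might be some blend of search volume, sales, and internal-search data.

```python
from itertools import islice


def plan_batches(pages, batch_sizes):
    """Split pages, most in-demand first, into batches of the given sizes."""
    ranked = iter(sorted(pages, key=lambda p: p["demand"], reverse=True))
    batches = [list(islice(ranked, size)) for size in batch_sizes]
    return [batch for batch in batches if batch]


def robots_meta(page, released_urls):
    """Return a noindex tag for pages that have not been reworked and released."""
    if page["url"] in released_urls:
        return ""  # released pages are left indexable
    return '<meta name="robots" content="noindex">'


# Hypothetical catalog with a single demand number per page.
catalog = [
    {"url": "/p/camera-9000", "demand": 950},
    {"url": "/p/neck-pillow", "demand": 400},
    {"url": "/p/adapter", "demand": 720},
    {"url": "/p/luggage-tag", "demand": 35},
    {"url": "/p/power-bank", "demand": 610},
]

batches = plan_batches(catalog, [2, 2])
released = {page["url"] for page in batches[0]}  # week one is done

for page in catalog:
    print(page["url"], robots_meta(page, released) or "(indexable)")
```

Pages that fall outside every planned batch simply stay `noindex` until a later batch picks them up.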

All right everyone, hope you’ve enjoyed this edition of Whiteboard Friday. We’ll see you again next week. Take care.

Video transcription by Speechpad.com

Google Q4 2013 Earnings Report $16.86 Billion In Revenues

Google announced fourth-quarter 2013 earnings today. Google’s revenue targets beat estimates, while their earnings-per-share did not. Google had revenues of $16.86 billion for the quarter ended December 31, 2013, an increase of 17% compared to the fourth quarter of 2012. Google’s…

Please visit Search Engine Land for the full article.

Yahoo Adds Local Business Snapshots Next To Search Results

You’re gonna say to yourself, “That looks a lot like Google’s local search results.” I’m talking about the latest interface change that Yahoo has announced: the display of local business information alongside search results. The local business info includes basic…

Please visit Search Engine Land for the full article.

Search Marketing Expo – SMX West Early Bird Rates Expire Saturday. Register Now!

Join the most accomplished internet marketers in the world at SMX West, March 11-13 in San Jose, CA. Check out the agenda & speaker roster, featuring three days of tactic packed sessions, a keynote from Google, invaluable networking opportunities, and much more. Save $200 with Early Bird rates…

Please visit Search Engine Land for the full article.

Anatomy of a Local SEO Campaign

[Image: Barbie Doll Stuck Up Butt X Ray. Source: The Internet]

  • Client Information Intake
  • Research > Strategy Recommendations
  • Site Audit > Upgrade
  • NAP/Citation Audit > Fix Issues
  • On Site Content Creation
  • Off Site Content Creation
  • Review Generation
  • Link Research > Link Outreach
  • Clean Up Link SPAM From Previous SEO Firms
  • Rinse
  • Repeat

The post Anatomy of a Local SEO Campaign appeared first on Local SEO Guide.

From 10 Blue Links To Entity SERPs: Is Your Website Ready?

Search is changing and along with it the landscape of search results. SERPs are more adaptive, more engaging, more informative, more interactive and more personalized. The adoption of Semantic Search- and Semantic Web-related enhanced displays in SERPs…

Seven on-page SEO tips for Baidu

1. Use the correct language

The first step to SEO success on Baidu is selecting the correct language. On the face of it this sounds simple, but there are several different dialects within China and the same word can be represented by several characters.

To circumnavigate these issues Baidu solely indexes simplified Chinese characters.

This is an advantage for English speaking companies’ websites hoping to rise in the Baidu rankings because, when you’re translating your site, there is no guessing as to which characters to include or which dialects to focus on.

2. Choose a suitable domain name

A suitable domain name is a crucial aspect of any website no matter the country or search engine being targeted.

Baidu has several recommendations for domain names including making it as short as possible and memorable.

Other advice provided by the Chinese search engine includes choosing a domain name that generates a sense of trust and using well-known suffixes such as .cn or .com.cn. Analysis shows that internet users find a .com domain name much easier to remember, e.g. Apple in China: www.apple.com/cn. It is not compulsory to register a .cn domain name.

3. Site structure

Your site should have a clear structure and navigation which helps users quickly find what they are looking for, while at the same time allowing search engines to quickly understand your site.

Baidu recommends a tree-like site structure with a flat architecture where possible, as this allows the search engines to find your content in as few clicks as possible. No more than four clicks to the deepest level is recommended.

The most important pages should be found in the top levels of the site and it is important to ensure that each page can be reached through at least one text link.
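As a minimal sketch of how the four-click guideline could be checked, the Python below runs a breadth-first search over a site's internal link graph and lists pages that sit deeper than four clicks from the homepage or cannot be reached at all. The link graph is a hand-written, hypothetical example; in practice it would come from a crawl of your own site.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: minimum number of clicks from home to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, home, max_clicks=4):
    """Pages deeper than max_clicks, plus pages no link path reaches at all."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in all_pages if depths.get(p, max_clicks + 1) > max_clicks)


# Hypothetical site: a chain of category pages and one orphaned page.
site_links = {
    "/": ["/a"],
    "/a": ["/b"],
    "/b": ["/c"],
    "/c": ["/d"],
    "/d": ["/e"],
    "/orphan": [],
}
print(too_deep(site_links, "/"))
```

Anything the function returns is a candidate for a link from a higher-level page.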

Baidu’s crawlers and algorithm are not as advanced as Google’s and so rely heavily on the on-page information provided by the web pages of a site.

In addition to this, due to poor connectivity in second and third tier cities, where desktop access is less common, Baidu’s crawlers will often crawl only the first 100 KB to 120 KB of content on a page.

Consequently, a site needs to be fast, mobile optimised and easy to crawl as it will be accessed through a 3G connection.
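A quick way to sanity-check a page against that limit is to see whether the snippets you care about fall inside the first 100 KB of the HTML. The Python sketch below does that on a string you supply; the 100 KB budget is the lower bound quoted above, and the function name and sample page are hypothetical. It says nothing about how Baidu actually truncates pages.

```python
CRAWL_BUDGET_BYTES = 100 * 1024  # lower bound of the 100 KB to 120 KB range above


def within_crawl_budget(html, must_have, budget=CRAWL_BUDGET_BYTES):
    """Report, per snippet, whether it appears within the first `budget` bytes."""
    head = html.encode("utf-8")[:budget]
    return {snippet: snippet.encode("utf-8") in head for snippet in must_have}


# Hypothetical page: the title comes early, one paragraph comes after a lot of filler.
sample_html = "<title>旅行相机</title>" + " " * 200_000 + "<p>late paragraph</p>"
print(within_crawl_budget(sample_html, ["<title>旅行相机</title>", "late paragraph"]))
```

The comparison is done on UTF-8 bytes rather than characters because Chinese text takes several bytes per character, so a page can hit the limit sooner than its character count suggests.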

Baidu also indexes far fewer pages than Google. As of last year Google had 48bn pages in its index, while Baidu had just over 800m pages. This means that a website with thousands of pages indexed in Google might only have a few hundred pages indexed in Baidu.

Therefore, with Baidu it’s more important than ever to get your technical and on-page SEO right. Because Baidu indexes far fewer pages than Google, doing so helps ensure that your most important pages are the ones that get indexed.

4. Avoiding crawling restrictions

As I’ve mentioned, Baidu’s crawlers are not as advanced as Google’s and therefore it’s imperative you avoid certain types of web content in order to be properly optimised in China. Baidu can only read text content and can’t read JavaScript and Ajax content or links.

Therefore it’s recommended that you avoid excessive image or Flash use and JavaScript links, and that you use regular text content instead of image files where possible.

Baidu’s crawlers will also ignore frames or iFrames. If you really must use Flash web pages, make a text version for search engines and put a link to it on the homepage.

5. Title tags

Your title tag should be attractive with a clear message and should contain the most important keywords.

When splitting up the keywords Baidu recommends an underscore (_) as a good separator. This is in direct contrast to Google and Bing, which recommend a hyphen (-). Common examples of underscore usage on well-known Chinese websites are:

Sina.com

Tech.HuanQiu.com

Recommended Title Tag Format

6. Meta Description Tag

Although the Meta description is not a ranking factor, Baidu highlights your search query with red text, rather than the bold text used in Google, which makes keyword heavy entries a lot more attention grabbing.

So getting keywords in your title and description will help to increase CTR.

You can check the CTR of a page using Baidu Tuiguang (www2.baidu.com) or Baidu Editor which is available for any companies that have a Baidu account in the ‘account brief info’ menu.

7. Subdomain

Baidu treats subdomains differently to Google and Bing. It doesn’t measure subdomains as totally separate sites with different ranking metrics. Instead, it treats subdomains more like an organised subcategory.

Therefore they inherit the root domain’s authority, meaning a subdomain’s authority is directly related to that of the main root domain.

A couple of points to bear in mind on subdomains:

  • Don’t have too many.
  • Don’t share the same content between two separate subdomains as Baidu will recognise this as duplicated. 
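The second point can be spot-checked before launch. As a minimal sketch, the Python below groups URLs whose text is identical after lowercasing and collapsing whitespace. The hostnames and copy are hypothetical, and exact-match hashing is only an assumption for illustration; it is not a description of how Baidu detects duplication and it will miss near-duplicates.

```python
import hashlib
from collections import defaultdict


def find_shared_content(pages):
    """Group URLs whose text matches after lowercasing and collapsing whitespace."""
    groups = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical subdomain pages; the first two differ only in case and spacing.
subdomain_pages = {
    "shop.example.cn/camera": "BMO Camera 9000   manufacturer description",
    "travel.example.cn/camera": "bmo camera 9000 manufacturer description",
    "blog.example.cn/review": "Our hands-on review of the BMO Camera 9000",
}
print(find_shared_content(subdomain_pages))
```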

As it’s Chinese New Year this Friday, why not get to know the country’s search engine and ecommerce market with our infographic below. 

Last Chance for SMX West Early Bird Rates – Register Now!

Search Marketing Expo – SMX West rates increase Saturday, February 1st. Register now and save on all passes! Join us March 11-13 in San Jose for: The latest tactics and strategies: Over 75% of the sessions are new. Check out the agenda. Presentations by more than 80 internet marketing experts…

Please visit Search Engine Land for the full article.

I Don’t Want No Scrub (Agency)

You know the song, so sing it with me people! I made some modifications to the lyrics below that demonstrate my point well: I don’t want no scrub (paid search practitioner) A scrub (PPC ) is a guy (PPC person) that can’t get no love from me Hanging out the passenger side Of his best…

Please visit Search Engine Land for the full article.

More Brands Lose Brand Rankings As Google Takes Action – Music Magpie Just One Casualty

With the release of the new data in SearchMetrics.com we can confirm that there are a number of people that will be suffering from a huge headache this week following the loss of their own brand search terms, a clear indication that they have received a penalisation from Google. With

The post More Brands Lose Brand Rankings As Google Takes Action – Music Magpie Just One Casualty appeared first on SEO Blog by Dave Naylor – SEO Tools, Tips & News.