Robots Exclusion Protocol Sets Standards for Yahoo, Google and Microsoft
Thursday, June 05, 2008
As users, search marketers, and webmasters, we have to love it when the search engines collaborate on a defined industry standard. They did this a couple of years ago with the standardized protocol for XML sitemaps. On Tuesday, the three major search engines announced unified documentation for the Robots Exclusion Protocol (REP), aimed at making webmasters' efforts more consistent and effective across all of them.

The Robots Exclusion Protocol is a standard that lets content publishers and website owners specify which parts of their site should be public and which should be kept private from search engine robots. Visibility can be controlled site-wide via the robots.txt file, or for individual pages via META tags.

On the official Yahoo Search Blog, Yahoo stated:
Since we've never detailed the specifics of implementing the protocol, today we're releasing detailed documentation on how REP directives will be handled by the three major search providers.
The documentation covers three groups of directives:
  • Common Robots.txt Directives, such as:
    Disallow -- Tells a crawler not to crawl your site, or specific paths within it. Your site's robots.txt file still needs to be fetched for this directive to be found, but the disallowed pages themselves will not be crawled.
    Allow -- Tells a crawler which specific paths may be crawled, and can be combined with Disallow. If both a Disallow and an Allow clause apply to a URL, the most specific rule (the longest matching path) wins. This is particularly useful when a large section of a site is disallowed except for a small section within it.
    $ Wildcard Support -- Tells a crawler to match from the end of a URL, so you can block files that share a pattern -- for example, all files with a certain extension, say .pdf -- without listing each one.
    Sitemap Location -- Tells a crawler where to find your sitemaps, pointing it to feeds of the site's content.
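A quick way to sanity-check robots.txt rules like the ones above is Python's standard-library parser. This tooling is our own illustration, not part of the announcement, and the domain and paths are hypothetical:

```python
from urllib import robotparser

# Rules mirroring the Disallow and Sitemap directives described above.
rules = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/sitemap.xml
""".splitlines()

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Anything under /private/ is blocked for all crawlers...
print(parser.can_fetch("*", "http://www.example.com/private/report.pdf"))  # False
# ...while the rest of the site stays crawlable.
print(parser.can_fetch("*", "http://www.example.com/index.html"))          # True
```

One caveat: Python's parser may apply rules in file order rather than the longest-match precedence the engines describe, so test any Allow/Disallow combinations against the engine documentation as well.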

  • META Tag Directives, such as:

    NOINDEX META Tag -- Tells a crawler not to index a given page. The page can still be crawled, but it is kept out of the index.
    NOFOLLOW META Tag -- Tells a crawler not to follow the links on a given page, discounting all of its outgoing links. This helps prevent publicly writable areas from being abused by spammers looking for link credit.
    NOSNIPPET META Tag -- Tells a search engine not to display a snippet (abstract) for the page in the search results.
    NOARCHIVE META Tag -- Tells a search engine not to show a "cached" link for the page, so no copy of the page is served to users from the engine's cache.
    NOODP META Tag -- Tells a search engine not to use the page's title and abstract from the Open Directory Project (ODP) in the search results.
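As a sketch, the META tag directives above go in a page's head section. A hypothetical page that should stay out of the index and pass no link credit might look like this:

```html
<!-- Hypothetical page using the META tag directives described above. -->
<html>
  <head>
    <title>Internal search results</title>
    <!-- Keep the page out of the index and discount all outgoing links -->
    <meta name="robots" content="noindex, nofollow">
    <!-- No snippet, no cached copy, no ODP title/abstract in results -->
    <meta name="robots" content="nosnippet, noarchive, noodp">
  </head>
  <body>...</body>
</html>
```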

  • Yahoo!-Specific Directives, such as:
    Crawl-Delay -- Allows a site to slow the frequency with which the crawler checks for new content.
    NOYDIR META Tag -- Similar to the NOODP META tag above, but applies to titles and abstracts from the Yahoo! Directory instead of the Open Directory Project.
    Robots-nocontent Tag -- Lets you mark the non-content parts of a page so the crawler can identify the main content and target the right pages for specific search queries. Sections tagged this way are not used when indexing the page or when building the abstract shown in the search results.
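To sketch how the Yahoo!-specific directives are applied (Slurp is Yahoo!'s crawler; the markup below is a hypothetical illustration): Crawl-Delay goes in robots.txt, while NOYDIR and robots-nocontent live in the page itself:

```
# robots.txt -- ask Yahoo!'s crawler to pause between requests
User-agent: Slurp
Crawl-delay: 5
```

```html
<!-- Skip the Yahoo! Directory title/abstract for this page -->
<meta name="robots" content="noydir">

<!-- Mark navigation, ads, and boilerplate as non-content via a class -->
<div class="robots-nocontent">
  ...sidebar links and ads...
</div>
```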
You can also find information on the Official Google Webmaster Central Blog or the Microsoft Live Search Webmaster Center Blog.


posted by Jody @ Thursday, June 05, 2008  