Google makes it a point to comply with the wishes of site owners as expressed in the robots.txt file: if a site owner asks Googlebot to refrain from crawling a page, that request is honored. Despite this strict compliance, the company often receives complaints from people who have found their blocked pages in search results.


Matt Cutts presents this short lecture to explain what may be happening underneath.
It should be noted that crawling and indexing are two distinct processes. Even if a site has never been crawled by Googlebot, it may still be indexed if many other pages link to it. The site may then appear in the search results when the prevalent anchor text of those links is used as the search query.

If a webmaster wants the site taken off the index entirely (so it never appears in the search results), then he or she should place a "noindex" meta tag on every page. An alternative would be to use the URL removal tool to immediately pull an entire site out of the Google index. One caveat: Googlebot must be able to crawl a page in order to see its noindex tag, so blocking that page in robots.txt can actually prevent the tag from taking effect. The URL removal tool, by contrast, works even when the site is blocked by robots.txt.
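As a rough sketch, the noindex directive described above is a meta tag placed in the head of each page, following Google's documented robots meta tag format:

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Tells compliant crawlers not to include this page in their index.
       Googlebot must be allowed to crawl the page to see this tag,
       so the page should not also be blocked in robots.txt. -->
  <meta name="robots" content="noindex">
  <title>Example page</title>
</head>
<body>...</body>
</html>
```

To target only Google's crawler rather than all search engines, `name="robots"` can be replaced with `name="googlebot"`.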

Video Link:
Uncrawled URLs in search results –