
Controlling Robots and Bots


For some webmasters, Google crawls too often (and consumes too much bandwidth); for others, it visits too infrequently. Some complain that it doesn't crawl their entire site, and others get upset when areas they didn't want accessible via search engines appear in the Google index.

To prevent Google from crawling certain pages, the best method is to use a robots.txt file. This is simply an ASCII text file that you place at the root of your domain. For example, if your domain is http://www.yourdomain.com, place the file at http://www.yourdomain.com/robots.txt.

The following robots.txt file would prevent all robots from accessing your image or CGI script directories:
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
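
If you want to double-check how a crawler would interpret these rules, Python's standard urllib.robotparser module applies the same exclusion logic. This is a minimal sketch, assuming the robots.txt above is live at http://www.yourdomain.com/robots.txt (a placeholder domain):

from urllib.robotparser import RobotFileParser

# Fetch and parse the live robots.txt (the domain is a placeholder).
rp = RobotFileParser()
rp.set_url("http://www.yourdomain.com/robots.txt")
rp.read()

# The disallowed directories are blocked for every user-agent...
print(rp.can_fetch("Googlebot", "http://www.yourdomain.com/images/logo.gif"))  # False
# ...while everything else remains crawlable.
print(rp.can_fetch("Googlebot", "http://www.yourdomain.com/index.html"))       # True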

The following robots.txt file would allow all robots to access all of your files and folders (the empty Disallow line means nothing is blocked):
User-agent: *
Disallow:

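You can also target an individual crawler by its user-agent name. As an illustrative sketch (the /private/ directory is a placeholder), the following robots.txt would block only Googlebot from that directory while leaving all other robots unrestricted:
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
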
To control Googlebot’s crawl rate, you need to sign up for Google Webmaster Tools. You can then choose from one of three settings for your crawl: faster, normal, or slower (although sometimes faster is not an available choice). Normal is the default (and recommended) crawl rate. A slower crawl will reduce Googlebot’s traffic on your server, but Google may not be able to crawl your site as often.
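
Note that robots.txt itself has no standard directive for crawl rate, and Googlebot ignores the non-standard Crawl-delay directive, so for Google the Webmaster Tools setting above is the only control. Some other crawlers (Yahoo's Slurp and Microsoft's msnbot, for example) do honor Crawl-delay. The following lines would ask such a crawler to wait at least ten seconds between requests:

User-agent: Slurp
Crawl-delay: 10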

Related SEO Tips

Submit Sitemap With Robots.txt
Another way to actively notify the search engines is through robots.txt, as shown below.
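
For instance, the major engines (including Google) will discover a sitemap referenced directly from robots.txt. A single line is enough; the sitemap URL below is a placeholder:

Sitemap: http://www.yourdomain.com/sitemap.xml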

Using the Correct Redirect Method
One more item you should be aware of is the effect of redirect pages on your search engine rankings; see the sketch below.
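
For search engines, a permanent (301) redirect is generally the safest choice because it passes the old page's ranking signals to the new URL. As a hedged sketch, on an Apache server a .htaccess file could declare one like this (both paths are placeholders):

Redirect 301 /old-page.html http://www.yourdomain.com/new-page.html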

Robots Exclusion Standard
A robots exclusion standard was crafted to allow you to tell any robot that you do not want some of your pages indexed, as in the example below.
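
For example, to keep all robots away from a single page (the filename here is a placeholder), you could add:

User-agent: *
Disallow: /do-not-index.html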