Robots.txt - The SEO Starting Point

An absent robots.txt can cause a search engine to receive a 404 error as soon as it attempts to index your site. A 404 error page is a simple message that tells web users and search engine spiders that the requested document could not be found. A spider being told that the robots.txt file is unavailable can matter to your site's search engine health.

What is a robots.txt file?

In simple terms, a robots.txt file tells search engines and other crawlers which parts of your site not to spider. This is the first file that a search engine will seek to locate upon visiting your site, and it does so by requesting the file from the root of your domain. To check that you have a robots.txt file, enter http://www.my-website.com/robots.txt in your browser, substituting your own domain. If the file is absent, a 404 error will be returned.
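The two ideas above, a rules file that names the parts of a site not to spider and a request to the root of the domain to find it, can be sketched with Python's standard library. This is only an illustration: the function names, the sample rules and the /private/ path are assumptions made for the example, not taken from any real site, and robots_status needs network access and will raise an error if the server cannot be reached at all.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ask every crawler to stay out of /private/.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(rules_text, url, agent="*"):
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)


def robots_url(page_url):
    """Return the robots.txt URL at the root of the page's domain."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_status(page_url, timeout=10):
    """Request robots.txt and return the HTTP status code (needs network)."""
    try:
        with urllib.request.urlopen(robots_url(page_url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 404 when the file is absent
```

Calling robots_status("http://www.my-website.com/") with your own domain in place of the placeholder should give 200 when the file exists and 404 when it does not.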

Why is returning 404 important for your site?

404s mean that your site is not as stable as it should be. When files that should be available to users are missing, the confidence that a search engine has in your domain can be reduced.

Google's mission is to organize the world's information and make it universally accessible and useful.

If your web documents are not available, they are not 'universally accessible'. You may be unaware of the number of 404 errors that your site is currently returning to the search engines. This can often be due to legacy links or incomplete sites.
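One rough way to get a feel for how many 404s a site returns is to request a list of its known URLs and note which ones come back missing. The sketch below assumes you already have such a list, for example from old links or a sitemap; the function names are made up for this example, and http_status needs network access.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Fetch `url` and return its HTTP status code (needs network)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def find_missing(urls, get_status=http_status):
    """Return the URLs in `urls` that answer with a 404 status."""
    return [url for url in urls if get_status(url) == 404]
```

The get_status parameter is there so the check can be tried out against canned status codes before pointing it at a live site.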

For information on how to write a robots.txt file, have a look at writing an appropriate robots.txt file for your site.

If you are interested in spiderability as a factor when designing your site, you may find Sean McManus's Webmaster Resources useful: website design tutorials, freeware and JavaScripts from internet journalist Sean McManus.



Creative Commons License
This work is licensed under a Creative Commons Attribution-No Derivative Works 2.5 License.