The Importance of the robots.txt File
When a search engine sends its webcrawler to your site, one of the first things the webcrawler will do is look in the root directory for the robots.txt file. A correctly formatted robots.txt file consists of one or more records, each providing instructions for a particular search-bot. A record generally has two components. The first is the user-agent line, where the name of the search-bot is listed. The second is one or more "disallow" lines, which tell the webcrawler which files or folders it should not crawl (e.g. a cgi-bin folder).
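As a rough illustration of how records are matched to search-bots, the sketch below feeds a two-record file to Python's standard urllib.robotparser module. The bot names ExampleBot and OtherBot, the /tmp/ folder, and the www.example.com address are made up for the example, and real crawlers may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one record for "ExampleBot", one for every other bot.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /cgi-bin/
Disallow: /tmp/

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(bot: str, path: str) -> bool:
    """Return True if this parser thinks `bot` may fetch `path`."""
    # www.example.com is a placeholder host, not a real site to test against.
    return parser.can_fetch(bot, "http://www.example.com" + path)


checks = [
    ("ExampleBot", "/tmp/file.txt"),
    ("OtherBot", "/tmp/file.txt"),
    ("OtherBot", "/cgi-bin/script.cgi"),
]
for bot, path in checks:
    print(bot, path, may_crawl(bot, path))
```

ExampleBot is governed by its own record, so /tmp/ is off limits to it, while any other bot falls back to the "*" record and is only kept out of /cgi-bin/.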
If you currently have a website and do not have a robots.txt file, you can create one easily. The file is plain text, so just open Notepad and save the file as robots.txt in the root directory of your site. Most webmasters can use one record that will apply to all of the search engine crawlers. Once you have opened Notepad, enter the following:
User-agent: *
Disallow:
The "*" applies this rule to all bots. In this example, there is nothing listed in the disallow line. This tells the robot that it may crawl the entire site. You can also enter a folder path here, such as "/private", if there is a folder that shouldn't be crawled. This can be very useful if you are still testing a portion of your website or if a section is still under construction.
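To see the difference between an empty disallow line and one that lists "/private", here is a minimal sketch using Python's standard urllib.robotparser module. The bot name ExampleBot and the www.example.com pages are placeholders invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The record from the article: nothing is disallowed.
OPEN_SITE = """\
User-agent: *
Disallow:
"""

# The same record with the "/private" folder listed.
PRIVATE_BLOCKED = """\
User-agent: *
Disallow: /private
"""


def load(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text held in a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


open_site = load(OPEN_SITE)
private_blocked = load(PRIVATE_BLOCKED)

# Placeholder URLs on a made-up host.
private_page = "http://www.example.com/private/page.html"
home_page = "http://www.example.com/index.html"

print(open_site.can_fetch("ExampleBot", private_page))
print(private_blocked.can_fetch("ExampleBot", private_page))
print(private_blocked.can_fetch("ExampleBot", home_page))
```

With the empty disallow line every page is open to the bot. Once "/private" is listed, pages under that folder are refused while the rest of the site stays available.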
Now that you know what should go into your robots.txt file, here are several common mistakes people make when creating these files. Do not enter free-form notes into the file, as these can confuse the webcrawler. If you need a comment, start it with the "#" character, which the robots.txt standard reserves for comments, and keep it on its own line. Also, the format should always be the user-agent on the first line, followed by the disallow(s). Do not reverse the order. Another common mistake involves using the incorrect case. Paths are case-sensitive, so if the disallowed folder is /private, make sure your robots.txt file does not list the folder as /Private. It seems like a very minor issue, but the rule will not match the folder you meant to protect. Finally, be careful with the Allow command. The original robots.txt convention defined only Disallow. Allow was added later and is part of the current standard (RFC 9309), but older or simpler webcrawlers may ignore it, so it is safest to write your rules in terms of what not to look at.
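The case and ordering mistakes can be checked before you upload the file. The sketch below again relies on Python's standard urllib.robotparser module. The helper name is_blocked, the bot name ExampleBot, and the www.example.com host are assumptions made for the example, and other crawlers may handle a badly ordered record differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Correct order: user-agent first, then the disallow line.
GOOD_ORDER = "User-agent: *\nDisallow: /private\n"

# Reversed order: the disallow line comes before any user-agent line.
REVERSED_ORDER = "Disallow: /private\nUser-agent: *\n"


def is_blocked(robots_text: str, path: str) -> bool:
    """Return True if this parser refuses `path` for a generic bot."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # Placeholder bot name and host, used only for illustration.
    return not parser.can_fetch("ExampleBot", "http://www.example.com" + path)


# The rule matches the exact case used in the file...
print(is_blocked(GOOD_ORDER, "/private/draft.html"))
# ...but not a folder whose name differs only in case.
print(is_blocked(GOOD_ORDER, "/Private/draft.html"))
# This parser drops a disallow line that has no user-agent before it.
print(is_blocked(REVERSED_ORDER, "/private/draft.html"))
```

In this parser the reversed record silently protects nothing, which is why the mistake is easy to miss.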
If you are still curious about the robots.txt file, you can find many more complex examples online. Just try one of your favorite websites and look for their robots.txt file. For example, you can go to http://www.cnn.com/robots.txt. If you need help creating a robots.txt file for your site, there are plenty of places online that will create the file for you for free. One example is http://www.seochat.com/seo-tools/robots-generator/. Despite its apparent simplicity, this file can make or break your site's chances with the search engines, because a single wrong disallow line can keep webcrawlers away from pages you want them to find. Make sure you have your robots.txt file in place and correctly formatted today.
About the Author
Justin Scarborough founded Profit Program Reviews in order to help others interested in affiliate marketing sort out the valuable information from the many scams out there. He also runs a webmasters website directory at www.thetopweblist.com.