Guidelines on Securing Public Web Servers
A cookie is a small piece of information that may be written to the user's hard drive when he 
or she visits a Web site.  The intent of cookies is to allow servers to recognize a specific 
browser (user).  In essence, they add state to the stateless HTTP protocol.  Unfortunately, 
cookies are usually sent in the clear and stored in the clear on the user's host, and so are 
vulnerable to compromise.  For example, known vulnerabilities in certain versions of Internet 
Explorer allow a malicious Web site to remotely collect all of a visitor's cookies 
without the visitor's knowledge.  Therefore, cookies should never contain data that can be used 
directly by an attacker (e.g., username, password).
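One common way to follow this advice is to place only a random, opaque identifier in the cookie and keep the sensitive data on the server.  The sketch below illustrates the idea with the Python standard library.  The session store SESSIONS, the function issue_session_cookie, and the cookie name session_id are hypothetical names chosen for illustration, and the Secure and HttpOnly attributes are an assumption about how the cookie might be hardened, not something this section prescribes.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: opaque token -> server-side data.
SESSIONS = {}


def issue_session_cookie(username):
    """Return a Set-Cookie header value that carries only an opaque token."""
    token = secrets.token_urlsafe(32)          # random, unguessable identifier
    SESSIONS[token] = {"username": username}   # sensitive data stays on the server

    cookie = SimpleCookie()
    cookie["session_id"] = token
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["secure"] = True      # ask the browser to send it only over HTTPS
    cookie["session_id"]["httponly"] = True    # hide it from client-side scripts
    return cookie["session_id"].OutputString()


header_value = issue_session_cookie("alice")
```

An attacker who captures this cookie learns only the token, not the username or password, although the token itself must still be protected because it identifies the session.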
5.2.4  Controlling Web "Bots" Impact on Web Servers
Web bots (a.k.a. agents or spiders) are software applications used to collect, analyze, and index 
Web content.  Web bots are used by numerous organizations for many purposes.  Some 
examples are as follows: 
    
Scooter, Slurp, and Googlebot slowly and carefully analyze, index, and record Web 
sites for Web search engines such as AltaVista and Google. 
    
ArchitextSpider gathers Internet statistics.  
    
Hyperlink validators are used by Webmasters to automatically validate the 
hyperlinks on their Web site.  
    
EmailSiphon and Cherry Picker are bots specifically designed to crawl Web sites for 
electronic mail (e-mail) addresses to add to unsolicited advertising e-mail ("spam") 
lists.  These are common examples of bots that may have a negative impact on a 
Web site or its users.
Unfortunately, bots can present a challenge to Webmasters and their servers: 
    
Web servers often contain directories that do not need to be indexed.  
    
Organizations might not want part of their site appearing in search engines.  
    
Web servers often contain temporary pages that should not be indexed.  
    
Organizations operating the Web server are paying for bandwidth and want to exclude 
robots and spiders that do not benefit their goals.  
    
Bots are not always well written or well intentioned and can hit a Web site with 
extremely rapid requests, causing reduced service or an outright denial of service (DoS) for 
legitimate users.
    
Bots may uncover information that the Webmaster would prefer remain secret 
or at least unadvertised (e.g., e-mail addresses).
Fortunately, there is a way for Web administrators or the Webmaster to influence the behavior 
of most bots on their Web site.  A series of agreements called the Robots Exclusion Protocol 
(REP), also known as the Robots Exclusion Standard, has been created.  Although REP is not an 
official Internet standard, it is supported by most well-written and well-intentioned bots, 
including those used by most major search engines.
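Under REP, a site publishes its rules in a plain-text file named robots.txt at the root of the Web site, and a cooperating bot consults those rules before requesting a page.  The sketch below uses Python's standard urllib.robotparser module to show how such a bot might interpret a rule set.  The rules in ROBOTS_TXT, the paths, and the helper may_fetch are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and policy are illustrative only.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow: /tmp/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # load the rules without a network request


def may_fetch(agent, path):
    """Return True if the rules allow the named bot to request the path."""
    return parser.can_fetch(agent, path)


for agent, path in [("Googlebot", "/index.html"),
                    ("Googlebot", "/tmp/draft.html"),
                    ("EmailSiphon", "/index.html")]:
    print(agent, path, may_fetch(agent, path))
```

Compliance is voluntary.  The file only states the site's wishes, so a bot that ignores REP is not stopped by it, and listing a directory in robots.txt also advertises that the directory exists.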