Search engines will look in your root domain for a special file named "robots.txt" (http://www.mydomain.com/robots.txt). The file tells the robot (spider) which files it may spider (download). This system is called, The Robots Exclusion Standard.