Sometimes You Have To Hide
By Bud Kraus
bud@joyofcode.com
Joy Of Code
Creator And Instructor
v2 i5
Originally Published: March 9, 2006
This isn't about playing hide and seek but, on second thought, maybe it is.
Much attention is paid to ways of getting found online - what are the best search engine optimization techniques, how do I get others to link to my pages, and so on. Fortunes are being made by people with successful solutions to the matter of bringing users and customers to web sites.
But what of the opposite need? The need to keep prying eyes away from your pages. Sites and pages under development and/or re-design are things we'd just rather not share with the world. We need to keep those hungry search engine spiders away as well.
Here are a few strategies to follow when you want to keep everyone - and everything - away from your web site.
1. Use password protection to keep out everyone - and everything.
The most ironclad and failsafe approach is to password-protect your pages under construction. This will keep everyone out - unless they're authorized to enter by knowing the user id and password.
This also has the advantage of keeping away search engine spiders, such as Googlebot, and preventing your pages from getting catalogued by search engines. That's very important. Suppose, during site production, you decide to remove a file (or rename it), thereby removing a URL from your site. If that page had already been spidered, someone doing a search could land on a page of your site that no longer exists. You want to avoid this!
Unless you know how to set user ids and passwords, you'll need some help from your system administrator or web host, since setting these up can't be done with XHTML/HTML.
If you have a live site, the system administrator can set the protection to apply only to a designated subfolder (including all other subfolders and files beneath it). You can have a protected subfolder called "build" (for example) and develop your new site on the same server where your live site resides. This way you'll get to test your new site in the same environment it will run in once it launches. It makes going live so much easier.
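Once the protection is in place, it's worth confirming that the folder really does turn strangers away. Here is a minimal Python sketch of that check, not a definitive tool. It assumes your host uses standard HTTP authentication, which answers a request made without credentials with status 401; other setups (a login form, for instance) behave differently. The function names and the example.com address are made up for illustration.

```python
import urllib.error
import urllib.request


def requires_login(status_code):
    """Return True when the status code is 401 (Unauthorized)."""
    return status_code == 401


def fetch_status(url, timeout=10):
    """Request the URL without credentials and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx answers;
        # the status code is carried on the exception.
        return err.code
```

Calling `requires_login(fetch_status("https://www.example.com/build/"))` with the address of your own protected folder should come back `True`. A result of `False` means the folder is open to visitors and spiders alike.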
2. Use meta info.
Another method, not as effective as the above, is to include the following meta tag in the head of your web document:
<meta name="robots" content="nofollow, noindex" />
or some variation of this idea. I suggest a search to find out more about how robots (search spiders) work.
The idea is to tell the spider not to catalogue a given page or follow its links to find other pages. A disadvantage is that this code has to be in every one of your pages, but if you have a small or mid-size site, that's not a problem. Just remember to remove the code once you want your page(s) to be spidered.
This approach will do nothing to keep people's prying eyes away from pages they might accidentally find.
You won't get me to vouch for how well this kind of approach works. That's why I always go with password-protecting pages.
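Because the meta tag has to go into every page and come out again before launch, a small script can save you from missing one. Below is a rough Python sketch, using only the standard library, that reports whether a page's HTML carries a robots meta tag with noindex. The class and function names are my own invention for illustration, and it looks only at the meta tag, nothing else a spider might obey.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").strip().lower() != "robots":
            return
        # The content value is a comma-separated list, e.g. "nofollow, noindex".
        for part in (attr.get("content") or "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html_text):
    """Return the set of robots directives declared in the page."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    finder.close()
    return finder.directives


def blocks_indexing(html_text):
    """Return True if the page asks spiders not to index it."""
    return "noindex" in robots_directives(html_text)
```

Read each of your files, pass the text to `blocks_indexing`, and list the ones that return `True`. While you're building, every page should be on that list; before launch it should be empty.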
3. Don't put your files online.
This one's almost too easy. If you're unwilling to use either of the methods above, then don't put your files online if you care whether:
- anyone sees your work
- spiders crawl through your files and index content that may be removed or renamed before you launch.
If you can live with both of these situations, then go ahead and put your files online unprotected, but it's not the way I'd go.
