Crawler peep show

If you run your own website, you are used to finding a lot of bots in your server logs. On some websites, bots and crawlers account for more than 60 or even 75 percent of all traffic. This raises the question: What are those bots doing on my pages?
Some of them are welcome. The search engine bots keep the search results on Google and Bing up to date. Many other crawlers are on a spam mission, looking for mail addresses to harvest or forms to fill with their crap.
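If you want a rough number for your own site, here is a minimal sketch that estimates the bot share from an access log. It assumes the common "combined" log format, where the user agent is the last quoted field of each line. The marker list and the sample lines are made up for illustration, and bots that disguise themselves as browsers will slip through.

```python
import re

# Assumed substrings that mark a user agent as a bot; extend for your own logs.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

# In the combined log format the user agent is the last quoted field on the line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def is_bot(user_agent):
    """Return True if the user agent contains one of the assumed bot markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def bot_share(lines):
    """Return the fraction of parseable log lines whose user agent looks like a bot."""
    total = 0
    bots = 0
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if match is None:
            continue  # skip lines that do not end with a quoted field
        total += 1
        if is_bot(match.group(1)):
            bots += 1
    return bots / total if total else 0.0


# Hypothetical sample lines, not taken from a real server.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2009:13:55:36 +0200] "GET / HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2009:13:56:01 +0200] "GET /about HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/3.5"',
    '157.55.39.1 - - [10/Oct/2009:13:57:12 +0200] "GET /feed HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.11 - - [10/Oct/2009:13:58:40 +0200] "GET / HTTP/1.1" 200 2326 "-" '
    '"Opera/9.80 (X11; Linux i686)"',
]

if __name__ == "__main__":
    print(f"Bot share: {bot_share(SAMPLE_LOG):.0%}")
```

To try it on a real log, pass an open file instead of the sample list, for example `bot_share(open("access.log"))`, where the file name is whatever your server uses.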
I recently came across Google's Webmaster Tools. They reveal some of the activities of Google's bot. Even on this very site of no particular importance, Google comes by every day and looks at 500 pages on average, about 13 megabytes per day.
Under Google Labs you can see a simulation of the Googlebot's behavior: kind of a crawler peep show. You also get hints about problematic descriptions and meta tags, along with some useful tips on how to optimize the website structure and HTML code. Most of them can be adopted in no time.
I particularly like the graph of your website's performance. It can prove useful in discussions with your hosting provider when they claim that it is only your impression that your site often responds terribly slowly.
Google gives some inside information as well: the most popular queries, sitelinks, internal links and keywords. You get this information without having to plant any JavaScript on your site, as you have to do when you want stats from Google Analytics. To get to the stats, register your website with Webmaster Tools. To identify yourself as the owner, you add a code to the meta tags of your website. Alternatively, you can upload a file to the root directory of your server, which is the easier way if you use rewrite rules in .htaccess.
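Once the meta tag is in place, it is worth checking that it really ends up in the delivered HTML. The following is a rough sketch using only Python's standard library. It assumes the tag is named `google-site-verification`, which is the name Google uses today (older setups may use a different one), and the sample page and its token are invented placeholders.

```python
from html.parser import HTMLParser


class VerificationTagFinder(HTMLParser):
    """Collect the content values of all meta tags with a given name."""

    def __init__(self, meta_name):
        super().__init__()
        self.meta_name = meta_name.lower()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name == self.meta_name and attributes.get("content"):
            self.tokens.append(attributes["content"])


def find_verification_tokens(html, meta_name="google-site-verification"):
    """Return the verification tokens found in the given HTML text."""
    finder = VerificationTagFinder(meta_name)
    finder.feed(html)
    return finder.tokens


# Hypothetical page; the token is a placeholder, not a real verification code.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Example</title>
    <meta name="google-site-verification" content="EXAMPLE-TOKEN" />
  </head>
  <body>Hello</body>
</html>
"""

if __name__ == "__main__":
    print(find_verification_tokens(SAMPLE_PAGE))
```

An empty list means the tag did not make it into the page, for example because a template or cache is serving an older version of the header.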

Author: Matthias

Computer journalist, family man, radio person and podcaster, nerd, blogger and skeptic. A blogger by conviction, and an advocate of a free, open internet in which interesting content does not all disappear into the data silos of a few large internet corporations.
