I’m often asked how to make a spider to crawl the web, a specific site, or a directory, to look for hubs, and so on. They’re not hard to make, and there are a ton of them out there. There is really no need to reinvent the wheel and write a load more that all do the same thing. That’s bad engineering: always reuse code. The following presentation is one I use a lot, and it will point you to a lot of places to get code, tutorials, information and so on. I also dish out some basic advice.
I hope it’s useful, lots of love, cj