Rebuilding Process Information Thread
#24
(01-07-2014, 03:50 PM)Raz Wrote: I wrote a scraper, but the IP address I was using got banned pretty quickly. Even if I randomized the IP/interval between requests, Google looks for patterns in the search requests and will block too many similar requests with a CAPTCHA. The funny thing is, the biggest web crawler in the world doesn't let you crawl their servers. Huh.

Good ol' Google. It would be nice if people could just ask them to pass along the cached pages in a handy archive file, but I doubt they'd even notice such a request, much less respond to it.

Messages In This Thread
Rebuilding Process Information Thread - by Dazz - 01-07-2014, 11:44 AM
RE: Rebuilding Process Information Thread - by Phaze - 01-07-2014, 08:55 PM
