
Is that kind of thing safe, though? In all probability, their robots.txt and user agreement disallow it. So if they detect your spider, they might shut down your account.

Not a good solution...
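
For what it's worth, the robots.txt half of that question can be checked mechanically before crawling anything. Below is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `allowed`, the bot name `mybot`, and the example rules and URLs are all made up for illustration, and this says nothing about what a site's user agreement permits.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, not taken from any real site.
rules = "User-agent: *\nDisallow: /private/\n"
print(allowed(rules, "mybot", "https://example.com/public/page"))
print(allowed(rules, "mybot", "https://example.com/private/page"))
```

In a real crawler you would fetch the site's actual robots.txt (for instance with `set_url()` and `read()` on the parser) rather than pass in a string.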



From my work with web bots, I've seen that it's really easy to make a spider that they can't positively identify as such.

If you guys really wanted this then I guess I could show you how...


Well, why not... Also, I guess one way to stay safe is to make it operate slowly, so that, for example, it wouldn't aggregate 2000 pages in 5 seconds.
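
The "operate slowly" idea can be sketched as a small throttle that enforces a minimum gap between requests. This is only an illustration: the class name `Throttle`, the 2-second interval, and the injectable `clock` and `sleep` parameters are assumptions of mine, not anything from a particular crawler.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep until min_interval has passed since the last call.

        Returns the number of seconds actually slept.
        """
        now = self._clock()
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited


# Usage: call wait() before every page fetch.
throttle = Throttle(min_interval=2.0)
throttle.wait()  # first call returns immediately
```

With a 2-second interval, 2000 pages would take a little over an hour instead of 5 seconds. The right interval depends on the site, so treat the number as a placeholder.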



