    Issue 21469: Mitigate risk of false positives with robotparser. · 122541be
    Raymond Hettinger authored
    * Repair the broken link to norobots-rfc.txt.
    
    * HTTP response codes >= 500 are treated as a failed read rather than as
    not found.  Not found means that we can assume the entire site is allowed.
    A 5xx server error tells us nothing.
    
    * A successful read() or parse() updates the mtime (which is defined to be "the
      time the robots.txt file was last fetched").
    
    * The can_fetch() method returns False unless we've had a read() with a 2xx or
    4xx response.  This avoids false positives in the case where a user calls
    can_fetch() before calling read().
    
    * I don't see any easy way to test this patch without hitting internet
    resources that might change or without use of mock objects that wouldn't
    provide much reassurance.
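
    As a rough illustration of the can_fetch() and mtime bullets above, the
    sketch below exercises urllib.robotparser offline through parse() instead
    of read(), so it needs no network access.  The robots.txt rules, the
    "ExampleBot" agent name, and the example.com URLs are made up for the
    example, and the behaviour shown assumes a Python version that includes
    this change.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, agent name, and URLs, used only for illustration.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /private/",
]
AGENT = "ExampleBot"

rp = RobotFileParser()

# Nothing has been read or parsed yet: mtime() is still 0 and
# can_fetch() answers False instead of giving a false positive.
mtime_before = rp.mtime()
before_read = rp.can_fetch(AGENT, "http://example.com/public/page.html")

# A successful parse() records the time the rules were last loaded.
rp.parse(ROBOTS_LINES)
mtime_after = rp.mtime()

# With rules loaded, answers follow the Disallow lines.
public_ok = rp.can_fetch(AGENT, "http://example.com/public/page.html")
private_ok = rp.can_fetch(AGENT, "http://example.com/private/page.html")

print(before_read, public_ok, private_ok)
```

    The 5xx case is not covered here: exercising it would require read()
    against a server that returns such a status.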
Name
Doc
Grammar
Include
Lib
Mac
Misc
Modules
Objects
PC
PCbuild
Parser
Python
Tools
.bzrignore
.gitignore
.hgeol
.hgignore
.hgtags
.hgtouch
LICENSE
Makefile.pre.in
README
config.guess
config.sub
configure
configure.ac
install-sh
pyconfig.h.in
setup.py