    Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629. · e57e50c8
    Authored by Ezio Melotti
    1) #8271: when a byte sequence is invalid, only the start byte and all the
       valid continuation bytes are now replaced by U+FFFD, instead of replacing
       the number of bytes specified by the start byte.
       See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
    2) 5- and 6-byte-long UTF-8 sequences are now considered invalid (no change
       in behavior);
    3) Add code and tests to reject surrogates (U+D800-U+DFFF) as defined in
       RFC 3629, but leave it commented out since it's not backward compatible;
    4) Change the error messages "unexpected code byte" to "invalid start byte"
       and "invalid data" to "invalid continuation byte";
    5) Add an extensive set of tests in test_unicode;
    6) Fix test_codeccallbacks because it was failing after this change.
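
The decoding rules in items 1, 2 and 4 can be observed from Python itself. The following is a minimal sketch, not part of the commit: the helper names `decode_replace` and `decode_error_reason` are hypothetical, and the outputs shown assume a current CPython 3 interpreter, where the RFC 3629 behavior described above is in effect.

```python
# Sketch only: helper names are hypothetical, and the results assume a
# current CPython 3 interpreter rather than the branch this commit targeted.

def decode_replace(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for each invalid subsequence."""
    return data.decode("utf-8", errors="replace")


def decode_error_reason(data: bytes) -> str:
    """Return the reason of the UnicodeDecodeError raised in strict mode,
    or an empty string if the input decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return ""


# Item 1: 0xF1 announces a 4-byte sequence, but only one valid continuation
# byte (0x80) follows. Only those two bytes are replaced; "AB" survives.
print(ascii(decode_replace(b"\xf1\x80AB")))            # '\ufffdAB'

# Item 2: 0xF8 would start a 5-byte sequence under RFC 2279; it is rejected.
print(decode_error_reason(b"\xf8\x88\x80\x80\x80"))     # invalid start byte

# Item 4: 0xC3 expects a continuation byte, but "(" is not one.
print(decode_error_reason(b"\xc3("))                    # invalid continuation byte
```

Before the change in item 1, the decoder would have consumed the number of bytes announced by the start byte, so the `AB` in the first example would have been swallowed by the replacement.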
Name
Last commit
Last update
..
stringlib
abstract.c
boolobject.c
bufferobject.c
bytearrayobject.c
bytes_methods.c
capsule.c
cellobject.c
classobject.c
cobject.c
codeobject.c
complexobject.c
descrobject.c
dictnotes.txt
dictobject.c
enumobject.c
exceptions.c
fileobject.c
floatobject.c
frameobject.c
funcobject.c
genobject.c
intobject.c
iterobject.c
listobject.c
listsort.txt
lnotab_notes.txt
longobject.c
memoryobject.c
methodobject.c
moduleobject.c
object.c
obmalloc.c
rangeobject.c
setobject.c
sliceobject.c
stringobject.c
structseq.c
tupleobject.c
typeobject.c
unicodectype.c
unicodeobject.c
unicodetype_db.h
weakrefobject.c