• Minimal fix for the complaints about pickling Unicode objects. (SF bugs #126161 and 123634). · fb10c3f6
    Guido van Rossum authored
    
    The solution doesn't use the unicode-escape encoding; that has other
    problems (it seems not 100% reversible).  Rather, it transforms the
    input Unicode object slightly before encoding it using
    raw-unicode-escape, so that the decoding will reconstruct the original
    string: backslash and newline characters are translated into their
    \uXXXX counterparts.
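
The transformation described above can be sketched in Python. This is a minimal illustration of the idea, not the code from cPickle.c; the function names `encode_for_pickle` and `decode_from_pickle` are hypothetical, and the sketch assumes a modern Python 3 interpreter, where the `raw-unicode-escape` codec leaves backslashes untouched when encoding.

```python
def encode_for_pickle(text):
    """Hypothetical helper: pre-escape backslash and newline, then encode.

    Backslashes are replaced first so that the backslash introduced by
    the newline replacement is not escaped a second time.
    """
    text = text.replace("\\", "\\u005c")  # backslash -> \u005c
    text = text.replace("\n", "\\u000a")  # newline   -> \u000a
    return text.encode("raw-unicode-escape")


def decode_from_pickle(data):
    """Hypothetical helper: plain raw-unicode-escape decoding suffices,
    because every backslash left in the data starts a \\uXXXX escape."""
    return data.decode("raw-unicode-escape")


def naive_round_trip(text):
    """Round trip through raw-unicode-escape with no pre-escaping,
    shown only to illustrate why the extra step is needed."""
    return text.encode("raw-unicode-escape").decode("raw-unicode-escape")


samples = ["", "a\\b", "line1\nline2", "\\u0041", "\u20ac"]
for s in samples:
    encoded = encode_for_pickle(s)
    # The encoded form never contains a raw newline, so it fits on one
    # line of a text-mode pickle.
    assert b"\n" not in encoded
    assert decode_from_pickle(encoded) == s
```

Without the pre-escaping step, a string made of a literal backslash followed by `u0041` does not survive the round trip: the decoder reads it as an escape sequence. That is one way to see why pickling strings containing backslashes was already unreliable before this change.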
    
    This is backwards incompatible for strings containing backslashes, but
    for some of those strings, the pickling was already broken.
    
    Note that SF bug #123634 complains specifically that cPickle fails to
    unpickle the pickle for u'' (the empty Unicode string) correctly.
    This was an off-by-one error in load_unicode().
    
    XXX Ugliness: in order to do the modified raw-unicode-escape, I've
    cut-and-pasted a copy of PyUnicode_EncodeRawUnicodeEscape() into this
    file that also encodes '\\' and '\n'.  It might be nice to migrate
    this into the Unicode implementation and give this encoding a new name
    ('half-raw-unicode-escape'? 'pickle-unicode-escape'?); that would help
    pickle.py too.  But right now I can't be bothered with the necessary
    infrastructural changes.
cPickle.c 108 KB