    Fix to the UTF-8 encoder: it failed on 0-length input strings. · bd3be8f0
    Marc-André Lemburg authored
    Fix for the UTF-8 decoder: it will now accept isolated surrogates
    (previously it raised an exception, which caused round-trips to
    fail).
    
    Added new tests for UTF-8 round-trip safety (we rely on UTF-8 for
    marshalling Unicode objects, so we better make sure it works for
    all Unicode code points, including isolated surrogates).
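
A minimal sketch of the kind of round-trip check described above, written against modern Python 3 rather than the C code this commit touches. The helper name `utf8_round_trips` is hypothetical. In Python 3 the strict UTF-8 codec rejects lone surrogates, so the sketch assumes the `surrogatepass` error handler as a stand-in for the permissive behaviour the commit describes. It also covers the 0-length input case from the encoder fix.

```python
def utf8_round_trips(text: str) -> bool:
    """Return True if text survives a UTF-8 encode/decode cycle unchanged."""
    # Assumption: 'surrogatepass' approximates a codec that accepts
    # isolated surrogates; the default strict handler would raise instead.
    encoded = text.encode("utf-8", "surrogatepass")
    return encoded.decode("utf-8", "surrogatepass") == text


# Empty input, plain ASCII, a non-BMP code point, and an embedded surrogate.
samples = ["", "abc", "\U00010000", "a\ud800b"]
sample_results = [utf8_round_trips(s) for s in samples]

# Every isolated surrogate, high (U+D800..U+DBFF) and low (U+DC00..U+DFFF).
surrogate_results = [utf8_round_trips(chr(cp)) for cp in range(0xD800, 0xE000)]
```

If both result lists contain only `True`, every sampled string, including each isolated surrogate, came back unchanged.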
    
    Bumped the PYC magic in a non-standard way -- please review. This
    was needed because the old PYC format used illegal UTF-8 sequences
    for isolated high surrogates which now raise an exception.
test_unicodedata 155 Bytes