    fixes bpo-31834: Use optimized code for BLAKE2 only with SSSE3+ (#4066) · 1aa00ff3
    Authored by Michał Górny
    Rework the code choosing BLAKE2 code paths: instead of using the
    optimized variant on all x86_64 machines, use it only when SSSE3 or
    a better instruction set is available.
    
    Firstly, this solves the problem of using the pure SSE2 code path on
    x86_64 machines. As reported in the bug, this code is slower than the
    reference code on all tested x86_64 machines. On Athlon64, which lacks
    SSSE3, it is as much as 2.5 times slower than the reference code.
    Checking for SSSE3 therefore ensures that the optimized implementation
    is only used when it has a chance of performing better.
    
    Secondly, this makes it possible to use SSSE3+ optimizations on 32-bit
    x86 systems. This gives up to a 2x speed gain on modern 32-bit x86
    systems (tested in a 32-bit chroot).
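
The selection rule described above can be sketched roughly as follows. This is an illustrative model only, not the code from this commit: the function name `use_optimized_blake2` and the idea of passing in a set of macro names are assumptions made for the example. The macro names themselves (`__SSSE3__`, `__SSE4_1__`, `__AVX__`, `__XOP__`) are the ones GCC and Clang predefine when the corresponding instruction set is enabled; which of them the real check tests is not stated in the message.

```python
# Hypothetical sketch: models "use the optimized path only with SSSE3 or better".
# The names below are illustrative and are not taken from setup.py.

# Macros that GCC/Clang predefine when the instruction set is enabled
# (for example via -mssse3 or -march=native).
SSSE3_OR_BETTER = ("__SSSE3__", "__SSE4_1__", "__AVX__", "__XOP__")


def use_optimized_blake2(predefined_macros):
    """Return True if any SSSE3-or-better macro is in the given set.

    The decision no longer depends on whether the target is x86_64 or
    32-bit x86, only on the instruction sets the compiler may emit.
    """
    return any(macro in predefined_macros for macro in SSSE3_OR_BETTER)


# Baseline x86_64 build: SSE2 only, so the reference code is kept.
baseline_x86_64 = {"__x86_64__", "__SSE2__"}
print(use_optimized_blake2(baseline_x86_64))

# 32-bit x86 build with SSSE3 enabled: the optimized code is allowed.
i386_with_ssse3 = {"__i386__", "__SSE2__", "__SSSE3__"}
print(use_optimized_blake2(i386_with_ssse3))
```

Keying the decision on instruction-set macros rather than on the machine architecture is what covers both cases in the message: SSE2-only x86_64 builds fall back to the reference code, and 32-bit builds with SSSE3 enabled gain the optimized path.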
setup.py 97.7 KB