    bpo-29240: PEP 540: Add a new UTF-8 Mode (#855) · 91106cd9
    Victor Stinner authored
    * Add -X utf8 command line option, PYTHONUTF8 environment variable
      and a new sys.flags.utf8_mode flag.
    * If the LC_CTYPE locale is "C" at startup: automatically enable the
      UTF-8 mode.
    * Add _winapi.GetACP(). encodings._alias_mbcs() now calls
      _winapi.GetACP() to get the ANSI code page.
    * locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8
      mode. As a side effect, open() now uses the UTF-8 encoding by
      default in this mode.
    * Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding
      in the UTF-8 Mode.
    * Update subprocess._args_from_interpreter_flags() to handle -X utf8.
    * Skip some tests relying on the current locale if the UTF-8 mode is
      enabled.
    * Add test_utf8mode.py.
    * _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to
      also return the length (number of wide characters).
    * pymain_get_global_config() and pymain_set_global_config() now
      always copy flag values, rather than only copying if the new value
      is greater than the old value.
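
The user-visible effects listed above can be observed from Python itself. The following is a minimal sketch, not part of the commit: the helper name `probe_utf8_mode` and the `_PROBE` snippet are hypothetical, and it assumes Python 3.7 or later, where `-X utf8` and `sys.flags.utf8_mode` exist. It starts a child interpreter with the mode switched on or off and reports the flag, the preferred encoding, and the default encoding chosen by `open()`.

```python
import json
import subprocess
import sys

# Code executed in the child interpreter (hypothetical probe, for illustration).
# It reports the three values the commit message says are affected.
_PROBE = (
    "import json, locale, os, sys\n"
    "with open(os.devnull) as f:\n"
    "    open_encoding = f.encoding\n"
    "print(json.dumps({\n"
    "    'utf8_mode': sys.flags.utf8_mode,\n"
    "    'preferred': locale.getpreferredencoding(False),\n"
    "    'open_default': open_encoding,\n"
    "}))\n"
)


def probe_utf8_mode(enabled):
    """Run a child interpreter with -X utf8 or -X utf8=0 and return its settings."""
    option = "utf8" if enabled else "utf8=0"
    result = subprocess.run(
        [sys.executable, "-X", option, "-c", _PROBE],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(probe_utf8_mode(True))
    print(probe_utf8_mode(False))
```

The spelling of the encoding name can differ between Python versions ('UTF-8' or 'utf-8'), so comparisons are best done case-insensitively. With the mode disabled, the reported encodings depend on the locale of the machine running the sketch.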
_bootlocale.py 1.76 KB
"""A minimal subset of the locale module used at interpreter startup
(imported by the _io module), in order to reduce startup time.

Don't import directly from third-party code; use the `locale` module instead!
"""

import sys
import _locale

if sys.platform.startswith("win"):
    def getpreferredencoding(do_setlocale=True):
        if sys.flags.utf8_mode:
            return 'UTF-8'
        return _locale._getdefaultlocale()[1]
else:
    try:
        _locale.CODESET
    except AttributeError:
        if hasattr(sys, 'getandroidapilevel'):
            # On Android langinfo.h and CODESET are missing, and UTF-8 is
            # always used in mbstowcs() and wcstombs().
            def getpreferredencoding(do_setlocale=True):
                return 'UTF-8'
        else:
            def getpreferredencoding(do_setlocale=True):
                if sys.flags.utf8_mode:
                    return 'UTF-8'
                # This path for legacy systems needs the more complex
                # getdefaultlocale() function, import the full locale module.
                import locale
                return locale.getpreferredencoding(do_setlocale)
    else:
        def getpreferredencoding(do_setlocale=True):
            assert not do_setlocale
            if sys.flags.utf8_mode:
                return 'UTF-8'
            result = _locale.nl_langinfo(_locale.CODESET)
            if not result and sys.platform == 'darwin':
                # nl_langinfo can return an empty string
                # when the setting has an invalid value.
                # Default to UTF-8 in that case because
                # UTF-8 is the default charset on OSX and
                # returning nothing will crash the
                # interpreter.
                result = 'UTF-8'
            return result