Commit ee3074e1, authored by Martin Panter

Issue #22088: Clarify base-64 alphabets and which characters are discarded

* There are only two base-64 alphabets defined by the RFCs, not three
* Due to the internal translation, plus (+) and slash (/) are never discarded
* standard_ and urlsafe_b64decode() discard characters as well

Also update the doc strings to clarify data types, based on revision
92760d2edc9e, correct the exception raised by b16decode(), and correct the
parameter name for the base-85 functions.
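
As an illustration of the behaviour described above (not part of the commit itself), the following sketch uses only the public `base64` functions and an input taken from the test table in this change. The variable names are chosen here for illustration.

```python
import base64
import binascii

data = b'YWJj\nYWI='  # sample input from the test table in this commit

# With validate=False (the default) the newline is discarded before decoding.
lenient = base64.b64decode(data)

# With validate=True the same newline is rejected with binascii.Error.
try:
    base64.b64decode(data, validate=True)
    strict_error = None
except binascii.Error as exc:
    strict_error = exc

# standard_b64decode() and urlsafe_b64decode() discard the newline as well.
standard = base64.standard_b64decode(data)
urlsafe = base64.urlsafe_b64decode(data)

# The URL- and filesystem-safe alphabet uses '-' for '+' and '_' for '/'.
raw = b'\xfb\xef\xbe\xff\xff\xff'
normal_text = base64.b64encode(raw)
urlsafe_text = base64.urlsafe_b64encode(raw)
```

Here `lenient`, `standard` and `urlsafe` all hold the same decoded bytes, while `strict_error` holds the exception raised in validating mode.
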
parent e1d4e587
@@ -24,8 +24,8 @@ POST request. The encoding algorithm is not the same as the
 There are two interfaces provided by this module. The modern interface
 supports encoding :term:`bytes-like objects <bytes-like object>` to ASCII
 :class:`bytes`, and decoding :term:`bytes-like objects <bytes-like object>` or
-strings containing ASCII to :class:`bytes`. All three :rfc:`3548` defined
-alphabets (normal, URL-safe, and filesystem-safe) are supported.
+strings containing ASCII to :class:`bytes`. Both base-64 alphabets
+defined in :rfc:`3548` (normal, and URL- and filesystem-safe) are supported.
 The legacy interface does not support decoding from strings, but it does
 provide functions for encoding and decoding to and from :term:`file objects
@@ -69,9 +69,10 @@ The modern interface provides:
    A :exc:`binascii.Error` exception is raised
    if *s* is incorrectly padded.
-   If *validate* is ``False`` (the default), non-base64-alphabet characters are
+   If *validate* is ``False`` (the default), characters that are neither
+   in the normal base-64 alphabet nor the alternative alphabet are
    discarded prior to the padding check. If *validate* is ``True``,
-   non-base64-alphabet characters in the input result in a
+   these non-alphabet characters in the input result in a
    :exc:`binascii.Error`.
@@ -89,7 +90,8 @@ The modern interface provides:
 .. function:: urlsafe_b64encode(s)
-   Encode :term:`bytes-like object` *s* using a URL-safe alphabet, which
+   Encode :term:`bytes-like object` *s* using the
+   URL- and filesystem-safe alphabet, which
    substitutes ``-`` instead of ``+`` and ``_`` instead of ``/`` in the
    standard Base64 alphabet, and return the encoded :class:`bytes`. The result
    can still contain ``=``.
@@ -97,7 +99,8 @@ The modern interface provides:
 .. function:: urlsafe_b64decode(s)
-   Decode :term:`bytes-like object` or ASCII string *s* using a URL-safe
+   Decode :term:`bytes-like object` or ASCII string *s*
+   using the URL- and filesystem-safe
    alphabet, which substitutes ``-`` instead of ``+`` and ``_`` instead of
    ``/`` in the standard Base64 alphabet, and return the decoded
    :class:`bytes`.
@@ -145,14 +148,14 @@ The modern interface provides:
    lowercase alphabet is acceptable as input. For security purposes, the default
    is ``False``.
-   A :exc:`TypeError` is raised if *s* is
+   A :exc:`binascii.Error` is raised if *s* is
    incorrectly padded or if there are non-alphabet characters present in the
    input.
-.. function:: a85encode(s, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)
+.. function:: a85encode(b, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)
-   Encode the :term:`bytes-like object` *s* using Ascii85 and return the
+   Encode the :term:`bytes-like object` *b* using Ascii85 and return the
    encoded :class:`bytes`.
    *foldspaces* is an optional flag that uses the special short sequence 'y'
@@ -172,9 +175,9 @@ The modern interface provides:
    .. versionadded:: 3.4
-.. function:: a85decode(s, *, foldspaces=False, adobe=False, ignorechars=b' \\t\\n\\r\\v')
+.. function:: a85decode(b, *, foldspaces=False, adobe=False, ignorechars=b' \\t\\n\\r\\v')
-   Decode the Ascii85 encoded :term:`bytes-like object` or ASCII string *s* and
+   Decode the Ascii85 encoded :term:`bytes-like object` or ASCII string *b* and
    return the decoded :class:`bytes`.
    *foldspaces* is a flag that specifies whether the 'y' short sequence
@@ -192,9 +195,9 @@ The modern interface provides:
    .. versionadded:: 3.4
-.. function:: b85encode(s, pad=False)
+.. function:: b85encode(b, pad=False)
-   Encode the :term:`bytes-like object` *s* using base85 (as used in e.g.
+   Encode the :term:`bytes-like object` *b* using base85 (as used in e.g.
    git-style binary diffs) and return the encoded :class:`bytes`.
    If *pad* is true, the input is padded with ``b'\0'`` so its length is a
......
This diff is collapsed.
@@ -243,14 +243,26 @@ class BaseXYTestCase(unittest.TestCase):
                  (b'@@', b''),
                  (b'!', b''),
                  (b'YWJj\nYWI=', b'abcab'))
+        funcs = (
+            base64.b64decode,
+            base64.standard_b64decode,
+            base64.urlsafe_b64decode,
+        )
         for bstr, res in tests:
-            self.assertEqual(base64.b64decode(bstr), res)
-            self.assertEqual(base64.b64decode(bstr.decode('ascii')), res)
+            for func in funcs:
+                with self.subTest(bstr=bstr, func=func):
+                    self.assertEqual(func(bstr), res)
+                    self.assertEqual(func(bstr.decode('ascii')), res)
             with self.assertRaises(binascii.Error):
                 base64.b64decode(bstr, validate=True)
             with self.assertRaises(binascii.Error):
                 base64.b64decode(bstr.decode('ascii'), validate=True)
+        # Normal alphabet characters not discarded when alternative given
+        res = b'\xFB\xEF\xBE\xFF\xFF\xFF'
+        self.assertEqual(base64.b64decode(b'++[[//]]', b'[]'), res)
+        self.assertEqual(base64.urlsafe_b64decode(b'++--//__'), res)
     def test_b32encode(self):
         eq = self.assertEqual
         eq(base64.b32encode(b''), b'')
@@ -360,6 +372,10 @@ class BaseXYTestCase(unittest.TestCase):
            b'\x01\x02\xab\xcd\xef')
         eq(base64.b16decode(array('B', b"0102abcdef"), True),
            b'\x01\x02\xab\xcd\xef')
+        # Non-alphabet characters
+        self.assertRaises(binascii.Error, base64.b16decode, '0102AG')
+        # Incorrect "padding"
+        self.assertRaises(binascii.Error, base64.b16decode, '010')
     def test_a85encode(self):
         eq = self.assertEqual
......
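
The `b16decode()` error cases exercised by the new tests can be reproduced outside the test suite with a small sketch like the one below. The helper `try_b16decode` is a hypothetical name introduced here for illustration, not something defined by the commit.

```python
import base64
import binascii


def try_b16decode(text):
    """Return the decoded bytes, or the binascii.Error that was raised."""
    try:
        return base64.b16decode(text)
    except binascii.Error as exc:
        return exc


valid = try_b16decode('0102ABCDEF')   # well-formed uppercase base-16
bad_digit = try_b16decode('0102AG')   # 'G' is not a base-16 digit
odd_length = try_b16decode('010')     # odd number of digits
```

Both malformed inputs yield a `binascii.Error` instance rather than a `TypeError`, which matches the corrected documentation in this change.
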