Commit 758bca6e authored by Alexandre Vassalotti

Improve pickle's documentation.

There is still much to be done, but I am committing my changes
incrementally to avoid losing them again (for a third time now).
parent 87eee631
...@@ -92,11 +92,9 @@ advantage that there are no restrictions imposed by external standards such as ...@@ -92,11 +92,9 @@ advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however it means that non-Python XDR (which can't represent pointer sharing); however it means that non-Python
programs may not be able to reconstruct pickled Python objects. programs may not be able to reconstruct pickled Python objects.
By default, the :mod:`pickle` data format uses a printable ASCII representation. By default, the :mod:`pickle` data format uses a compact binary representation.
This is slightly more voluminous than a binary representation. The big The module :mod:`pickletools` contains tools for analyzing data streams
advantage of using printable ASCII (and of some other characteristics of generated by :mod:`pickle`.
:mod:`pickle`'s representation) is that for debugging or recovery purposes it is
possible for a human to read the pickled file with a standard text editor.
There are currently 4 different protocols which can be used for pickling. There are currently 4 different protocols which can be used for pickling.
...@@ -110,17 +108,15 @@ There are currently 4 different protocols which can be used for pickling. ...@@ -110,17 +108,15 @@ There are currently 4 different protocols which can be used for pickling.
efficient pickling of :term:`new-style class`\es. efficient pickling of :term:`new-style class`\es.
* Protocol version 3 was added in Python 3.0. It has explicit support for * Protocol version 3 was added in Python 3.0. It has explicit support for
bytes and cannot be unpickled by Python 2.x pickle modules. bytes and cannot be unpickled by Python 2.x pickle modules. This is
the current recommended protocol; use it whenever possible.
Refer to :pep:`307` for more information. Refer to :pep:`307` for more information.
If a *protocol* is not specified, protocol 3 is used. If *protocol* is If a *protocol* is not specified, protocol 3 is used. If *protocol* is
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
protocol version available will be used. protocol version available will be used.
A binary format, which is slightly more efficient, can be chosen by specifying a
*protocol* version >= 1.
Usage Usage
----- -----
...@@ -146,152 +142,210 @@ an unpickler, then you call the unpickler's :meth:`load` method. The ...@@ -146,152 +142,210 @@ an unpickler, then you call the unpickler's :meth:`load` method. The
as line terminators and therefore will look "funny" when viewed in Notepad or as line terminators and therefore will look "funny" when viewed in Notepad or
other editors which do not support this format. other editors which do not support this format.
.. data:: DEFAULT_PROTOCOL
The default protocol used for pickling. May be less than HIGHEST_PROTOCOL.
Currently the default protocol is 3; a backward-incompatible protocol
designed for Python 3.0.
The :mod:`pickle` module provides the following functions to make the pickling The :mod:`pickle` module provides the following functions to make the pickling
process more convenient: process more convenient:
.. function:: dump(obj, file[, protocol]) .. function:: dump(obj, file[, protocol])
Write a pickled representation of *obj* to the open file object *file*. This is Write a pickled representation of *obj* to the open file object *file*. This
equivalent to ``Pickler(file, protocol).dump(obj)``. is equivalent to ``Pickler(file, protocol).dump(obj)``.
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* is The optional *protocol* argument tells the pickler to use the given protocol;
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest supported protocols are 0, 1, 2, 3. The default protocol is 3; a
protocol version will be used. backward-incompatible protocol designed for Python 3.0.
*file* must have a :meth:`write` method that accepts a single string argument. Specifying a negative protocol version selects the highest protocol version
It can thus be a file object opened for writing, a :mod:`StringIO` object, or supported. The higher the protocol used, the more recent the version of
any other custom object that meets this interface. Python needed to read the pickle produced.
The *file* argument must have a write() method that accepts a single bytes
argument. It can thus be a file object opened for binary writing, an
io.BytesIO instance, or any other custom object that meets this interface.
.. function:: load(file) .. function:: dumps(obj[, protocol])
Read a string from the open file object *file* and interpret it as a pickle data Return the pickled representation of the object as a :class:`bytes`
stream, reconstructing and returning the original object hierarchy. This is object, instead of writing it to a file.
equivalent to ``Unpickler(file).load()``.
*file* must have two methods, a :meth:`read` method that takes an integer The optional *protocol* argument tells the pickler to use the given protocol;
argument, and a :meth:`readline` method that requires no arguments. Both supported protocols are 0, 1, 2, 3. The default protocol is 3; a
methods should return a string. Thus *file* can be a file object opened for backward-incompatible protocol designed for Python 3.0.
reading, a :mod:`StringIO` object, or any other custom object that meets this
interface.
This function automatically determines whether the data stream was written in Specifying a negative protocol version selects the highest protocol version
binary mode or not. supported. The higher the protocol used, the more recent the version of
Python needed to read the pickle produced.
.. function:: load(file, [\*, encoding="ASCII", errors="strict"])
.. function:: dumps(obj[, protocol]) Read a pickled object representation from the open file object *file* and
return the reconstituted object hierarchy specified therein. This is
equivalent to ``Unpickler(file).load()``.
Return the pickled representation of the object as a :class:`bytes` The protocol version of the pickle is detected automatically, so no protocol
object, instead of writing it to a file. argument is needed. Bytes past the pickled object's representation are
ignored.
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* The argument *file* must have two methods, a read() method that takes an
is specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest integer argument, and a readline() method that requires no arguments. Both
protocol version will be used. methods should return bytes. Thus *file* can be a binary file object opened
for reading, a BytesIO object, or any other custom object that meets this
interface.
Optional keyword arguments are encoding and errors, which are used to decode
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
'strict', respectively.
.. function:: loads(bytes_object) .. function:: loads(bytes_object, [\*, encoding="ASCII", errors="strict"])
Read a pickled object hierarchy from a :class:`bytes` object. Read a pickled object hierarchy from a :class:`bytes` object and return the
Bytes past the pickled object's representation are ignored. reconstituted object hierarchy specified therein.
The :mod:`pickle` module also defines three exceptions: The protocol version of the pickle is detected automatically, so no protocol
argument is needed. Bytes past the pickled object's representation are
ignored.
Optional keyword arguments are encoding and errors, which are used to decode
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
'strict', respectively.
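The four module-level functions described above can be exercised together in a short round trip. The following is a minimal sketch, assuming a Python 3 interpreter; the `record` dictionary and the variable names are illustrative only and do not come from the commit.

```python
import io
import pickle

# Illustrative sample data (not from the documentation being changed).
record = {"name": "spam", "values": [1, 2, 3], "raw": b"\x00\x01"}

# In-memory round trip: dumps() returns bytes, loads() rebuilds the object.
payload = pickle.dumps(record, protocol=3)
restored = pickle.loads(payload)

# File-based round trip through a binary file-like object.
buffer = io.BytesIO()
pickle.dump(record, buffer, protocol=pickle.HIGHEST_PROTOCOL)
buffer.seek(0)
restored_from_file = pickle.load(buffer)

# A negative protocol selects the highest supported protocol.
newest = pickle.dumps(record, protocol=-1)
```

Note that `load()` and `loads()` take no protocol argument, because the protocol version is detected from the data stream itself.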
The :mod:`pickle` module defines three exceptions:
.. exception:: PickleError .. exception:: PickleError
A common base class for the other exceptions defined below. This inherits from Common base class for the other pickling exceptions. It inherits
:exc:`Exception`. :exc:`Exception`.
.. exception:: PicklingError .. exception:: PicklingError
This exception is raised when an unpicklable object is passed to the Error raised when an unpicklable object is encountered by :class:`Pickler`.
:meth:`dump` method. It inherits :exc:`PickleError`.
.. exception:: UnpicklingError .. exception:: UnpicklingError
This exception is raised when there is a problem unpickling an object. Note that Error raised when there is a problem unpickling an object, such as data
other exceptions may also be raised during unpickling, including (but not corruption or a security violation. It inherits :exc:`PickleError`.
necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
:exc:`ImportError`, and :exc:`IndexError`.
The :mod:`pickle` module also exports two callables, :class:`Pickler` and Note that other exceptions may also be raised during unpickling, including
:class:`Unpickler`: (but not necessarily limited to) AttributeError, EOFError, ImportError, and
IndexError.
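Since unpickling can raise exceptions other than `UnpicklingError`, callers that want to tolerate bad input may need to catch several types. The helper below is only a sketch of that idea; `try_loads` and `UNPICKLE_ERRORS` are hypothetical names, and the tuple covers just the exceptions named in the text, which the text itself says is not necessarily complete.

```python
import pickle

# Exceptions the text lists as possible during unpickling
# (explicitly "not necessarily limited to" these).
UNPICKLE_ERRORS = (
    pickle.UnpicklingError,
    AttributeError,
    EOFError,
    ImportError,
    IndexError,
)


def try_loads(data, default=None):
    """Return the unpickled object, or *default* if unpickling fails."""
    try:
        return pickle.loads(data)
    except UNPICKLE_ERRORS:
        return default


good = try_loads(pickle.dumps([1, 2]))
bad = try_loads(b"")  # empty input cannot be unpickled
```

Catching these exceptions does not make it safe to unpickle untrusted data; it only keeps malformed input from propagating an error.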
.. class:: Pickler(file[, protocol]) The :mod:`pickle` module exports two classes, :class:`Pickler` and
:class:`Unpickler`:
This takes a file-like object to which it will write a pickle data stream. .. class:: Pickler(file[, protocol])
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* is This takes a binary file for writing a pickle data stream.
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
protocol version will be used.
*file* must have a :meth:`write` method that accepts a single string argument. The optional *protocol* argument tells the pickler to use the given protocol;
It can thus be an open file object, a :mod:`StringIO` object, or any other supported protocols are 0, 1, 2, 3. The default protocol is 3; a
custom object that meets this interface. backward-incompatible protocol designed for Python 3.0.
:class:`Pickler` objects define one (or two) public methods: Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of
Python needed to read the pickle produced.
The *file* argument must have a write() method that accepts a single bytes
argument. It can thus be a file object opened for binary writing, an
io.BytesIO instance, or any other custom object that meets this interface.
.. method:: dump(obj) .. method:: dump(obj)
Write a pickled representation of *obj* to the open file object given in the Write a pickled representation of *obj* to the open file object given in
constructor. Either the binary or ASCII format will be used, depending on the the constructor.
value of the *protocol* argument passed to the constructor.
.. method:: persistent_id(obj)
Do nothing by default. This exists so a subclass can override it.
If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
other value causes :class:`Pickler` to emit the returned value as a
persistent ID for *obj*. The meaning of this persistent ID should be
defined by :meth:`Unpickler.persistent_load`. Note that the value
returned by :meth:`persistent_id` cannot itself have a persistent ID.
See :ref:`pickle-persistent` for details and examples of uses.
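As a rough illustration of how `persistent_id` pairs with `Unpickler.persistent_load`, the sketch below replaces references to externally stored objects with a small tuple ID. The `EXTERNAL_STORE` dictionary and the `ExternalRef`, `RefPickler`, and `RefUnpickler` classes are hypothetical and exist only for this example.

```python
import io
import pickle

# Hypothetical external store; in practice this might be a database.
EXTERNAL_STORE = {"user:1": {"name": "alice"}, "user:2": {"name": "bob"}}


class ExternalRef:
    """Hypothetical placeholder for an object that lives outside the pickle."""

    def __init__(self, key):
        self.key = key


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a persistent ID for external references only.
        if isinstance(obj, ExternalRef):
            return ("ExternalRef", obj.key)
        return None  # everything else is pickled as usual


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the persistent ID back to the externally stored object.
        tag, key = pid
        if tag == "ExternalRef" and key in EXTERNAL_STORE:
            return EXTERNAL_STORE[key]
        raise pickle.UnpicklingError(f"unsupported persistent id: {pid!r}")


buffer = io.BytesIO()
RefPickler(buffer).dump([ExternalRef("user:1"), "plain value"])
buffer.seek(0)
loaded = RefUnpickler(buffer).load()
```

Here the pickle stream carries only the ID tuple, so the object returned for the first list item is the very dictionary held in the store rather than a copy.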
.. method:: clear_memo() .. method:: clear_memo()
Clears the pickler's "memo". The memo is the data structure that remembers Deprecated. Use the :meth:`clear` method on the :attr:`memo`. Clear the
which objects the pickler has already seen, so that shared or recursive objects pickler's memo, useful when reusing picklers.
pickled by reference and not by value. This method is useful when re-using
picklers. .. attribute:: fast
Enable fast mode if set to a true value. The fast mode disables the usage
of memo, therefore speeding the pickling process by not generating
superfluous PUT opcodes. It should not be used with self-referential
objects; doing otherwise will cause :class:`Pickler` to recurse
infinitely.
Use :func:`pickletools.optimize` if you need more compact pickles.
.. attribute:: memo
Dictionary holding previously pickled objects to allow shared or
recursive objects to be pickled by reference as opposed to by value.
It is possible to make multiple calls to the :meth:`dump` method of the same It is possible to make multiple calls to the :meth:`dump` method of the same
:class:`Pickler` instance. These must then be matched to the same number of :class:`Pickler` instance. These must then be matched to the same number of
calls to the :meth:`load` method of the corresponding :class:`Unpickler` calls to the :meth:`load` method of the corresponding :class:`Unpickler`
instance. If the same object is pickled by multiple :meth:`dump` calls, the instance. If the same object is pickled by multiple :meth:`dump` calls, the
:meth:`load` will all yield references to the same object. [#]_ :meth:`load` will all yield references to the same object.
:class:`Unpickler` objects are defined as: Please note, this is intended for pickling multiple objects without intervening
modifications to the objects or their parts. If you modify an object and then
pickle it again using the same :class:`Pickler` instance, the object is not
pickled again --- a reference to it is pickled and the :class:`Unpickler` will
return the old value, not the modified one.
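The caveat above is easy to reproduce. This sketch, with illustrative variable names, pickles a list, modifies it, and pickles it again through the same `Pickler`; the matching `Unpickler` then returns the original contents twice.

```python
import io
import pickle

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)

data = [1, 2]
pickler.dump(data)   # full contents are written
data.append(3)
pickler.dump(data)   # only a memo reference is written this time

buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()  # resolves the reference to the first object
```

Both loads yield the same list object holding the old value, so the appended element is lost. Using a new `Pickler` for the second dump is one way to avoid this.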
.. class:: Unpickler(file) .. class:: Unpickler(file, [\*, encoding="ASCII", errors="strict"])
This takes a file-like object from which it will read a pickle data stream. This takes a binary file for reading a pickle data stream.
This class automatically determines whether the data stream was written in
binary mode or not, so it does not need a flag as in the :class:`Pickler`
factory.
*file* must have two methods, a :meth:`read` method that takes an integer The protocol version of the pickle is detected automatically, so no
argument, and a :meth:`readline` method that requires no arguments. Both protocol argument is needed.
methods should return a string. Thus *file* can be a file object opened for
reading, a :mod:`StringIO` object, or any other custom object that meets this
interface.
:class:`Unpickler` objects have one (or two) public methods: The argument *file* must have two methods, a read() method that takes an
integer argument, and a readline() method that requires no arguments. Both
methods should return bytes. Thus *file* can be a binary file object opened
for reading, a BytesIO object, or any other custom object that meets this
interface.
Optional keyword arguments are encoding and errors, which are used to decode
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
'strict', respectively.
.. method:: load() .. method:: load()
Read a pickled object representation from the open file object given in Read a pickled object representation from the open file object given in
the constructor, and return the reconstituted object hierarchy specified the constructor, and return the reconstituted object hierarchy specified
therein. therein. Bytes past the pickled object's representation are ignored.
This method automatically determines whether the data stream was written .. method:: persistent_load(pid)
in binary mode or not.
Raise an :exc:`UnpicklingError` by default.
.. method:: noload() If defined, :meth:`persistent_load` should return the object specified by
the persistent ID *pid*. On errors, such as if an invalid persistent ID is
encountered, an :exc:`UnpicklingError` should be raised.
This is just like :meth:`load` except that it doesn't actually create any See :ref:`pickle-persistent` for details and examples of uses.
objects. This is useful primarily for finding what's called "persistent
ids" that may be referenced in a pickle data stream. See section .. method:: find_class(module, name)
:ref:`pickle-protocol` below for more details.
Import *module* if necessary and return the object called *name* from it.
Subclasses may override this to gain control over what type of objects can
be loaded, potentially reducing security risks.
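One way a subclass might use `find_class` to limit what can be loaded is sketched below. The allow-list `SAFE_BUILTINS` and the names `RestrictedUnpickler` and `restricted_loads` are assumptions made for illustration, not part of this commit, and an allow-list like this reduces risk without making untrusted pickles safe.

```python
import builtins
import collections
import io
import pickle

# Hypothetical allow-list of builtins considered harmless to reconstruct.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals that are explicitly allowed.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps(range(3)))

try:
    restricted_loads(pickle.dumps(collections.OrderedDict()))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

Plain containers such as lists and dictionaries never reach `find_class`, so they load normally; only objects that require a global lookup are checked against the allow-list.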
What can be pickled and unpickled? What can be pickled and unpickled?
...@@ -506,6 +560,8 @@ The registered constructor is deemed a "safe constructor" for purposes of ...@@ -506,6 +560,8 @@ The registered constructor is deemed a "safe constructor" for purposes of
unpickling as described above. unpickling as described above.
.. _pickle-persistent:
Pickling and unpickling external objects Pickling and unpickling external objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...@@ -747,14 +803,6 @@ the same process or a new process. :: ...@@ -747,14 +803,6 @@ the same process or a new process. ::
.. [#] Don't confuse this with the :mod:`marshal` module .. [#] Don't confuse this with the :mod:`marshal` module
.. [#] *Warning*: this is intended for pickling multiple objects without intervening
modifications to the objects or their parts. If you modify an object and then
pickle it again using the same :class:`Pickler` instance, the object is not
pickled again --- a reference to it is pickled and the :class:`Unpickler` will
return the old value, not the modified one. There are two problems here: (1)
detecting changes, and (2) marshalling a minimal set of changes. Garbage
Collection may also become a problem here.
.. [#] The exception raised will likely be an :exc:`ImportError` or an .. [#] The exception raised will likely be an :exc:`ImportError` or an
:exc:`AttributeError` but it could be something else. :exc:`AttributeError` but it could be something else.