:mod:`email.generator`: Generating MIME documents
-------------------------------------------------

.. module:: email.generator
   :synopsis: Generate flat text email messages from a message structure.

**Source code:** :source:`Lib/email/generator.py`

--------------

One of the most common tasks is to generate the flat (serialized) version of
the email message represented by a message object structure.  You will need to
do this if you want to send your message via :meth:`smtplib.SMTP.sendmail` or
the :mod:`nntplib` module, or print the message on the console.  Taking a
message object structure and producing a serialized representation is the job
of the generator classes.

As with the :mod:`email.parser` module, you aren't limited to the functionality
of the bundled generator; you could write one from scratch yourself.  However,
the bundled generator knows how to generate most email in a standards-compliant
way, should handle MIME and non-MIME email messages just fine, and is designed
so that the bytes-oriented parsing and generation operations are inverses,
assuming the same non-transforming :mod:`~email.policy` is used for both.  That
is, parsing the serialized byte stream via the
:class:`~email.parser.BytesParser` class and then regenerating the serialized
byte stream using :class:`BytesGenerator` should produce output identical to
the input [#]_.  (On the other hand, using the generator on an
:class:`~email.message.EmailMessage` constructed programmatically may result in
changes to the :class:`~email.message.EmailMessage` object as defaults are
filled in.)
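
The round trip described above can be sketched as follows.  This is a minimal
illustration, not a guarantee for arbitrary input: the sample message and the
name ``roundtrip_policy`` are assumptions made for the example, and the policy
is cloned with ``refold_source="none"`` so that parsed headers are not refolded.

```python
from io import BytesIO
from email import policy
from email.parser import BytesParser
from email.generator import BytesGenerator

# A small, RFC-conformant message using CRLF line endings (sample data).
raw = (
    b"From: sender@example.com\r\n"
    b"To: recipient@example.com\r\n"
    b"Subject: Round trip\r\n"
    b"\r\n"
    b"Hello, world.\r\n"
)

# Use the same non-transforming policy for parsing and for generation.
roundtrip_policy = policy.SMTP.clone(refold_source="none")

msg = BytesParser(policy=roundtrip_policy).parsebytes(raw)

out = BytesIO()
BytesGenerator(out, policy=roundtrip_policy).flatten(msg)
regenerated = out.getvalue()

print(regenerated == raw)
```

For a well-formed message such as this one, the regenerated bytes are expected
to match the input; messages that violate the RFCs may not survive unchanged.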

The :class:`Generator` class can be used to flatten a message into a text (as
opposed to binary) serialized representation, but since Unicode cannot
represent binary data directly, the message is of necessity transformed into
something that contains only ASCII characters, using the standard email RFC
Content Transfer Encoding techniques for encoding email messages for transport
over channels that are not "8 bit clean".


.. class:: BytesGenerator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                          policy=None)

   Return a :class:`BytesGenerator` object that will write any message provided
   to the :meth:`flatten` method, or any surrogateescape encoded text provided
   to the :meth:`write` method, to the :term:`file-like object` *outfp*.
   *outfp* must support a ``write`` method that accepts binary data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line.  *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_).

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*, or if ``0``, do not rewrap any headers.  If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation.  If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation.  See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionadded:: 3.2

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`BytesGenerator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit`` (the default), copy any headers in the original parsed
      message that have not been modified to the output with any bytes with the
      high bit set reproduced as in the original, and preserve the non-ASCII
      :mailheader:`Content-Transfer-Encoding` of any body parts that have them.
      If ``cte_type`` is ``7bit``, convert the bytes with the high bit set as
      needed using an ASCII-compatible :mailheader:`Content-Transfer-Encoding`.
      That is, transform parts with non-ASCII
      :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.

      .. XXX: There should be an option that just does the RFC
         compliance transformation on headers but leaves CTE 8bit parts alone.

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object.  If the root object has
      no envelope header, craft a standard one.  The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the separator character between
      all the lines of the flattened message.  If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.


   .. method:: clone(fp)

      Return an independent clone of this :class:`BytesGenerator` instance with
      the exact same option settings, and *fp* as the new *outfp*.


   .. method:: write(s)

      Encode *s* using the ``ASCII`` codec and the ``surrogateescape`` error
      handler, and pass it to the *write* method of the *outfp* passed to the
      :class:`BytesGenerator`'s constructor.


As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_bytes` and ``bytes(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__bytes__`), which simplify the generation of
a serialized binary representation of a message object.  For more detail, see
:mod:`email.message`.
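
As a rough sketch of how the :class:`BytesGenerator` options fit together, the
example below writes one message twice: once in the style of a Unix mbox entry
(``mangle_from_=True`` and ``unixfrom=True``), and once with settings chosen to
mirror :meth:`~email.message.EmailMessage.as_bytes`.  The message contents and
variable names are assumptions made for illustration.

```python
from io import BytesIO
from email.message import EmailMessage
from email.generator import BytesGenerator

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Mbox example"
msg.set_content("Hello,\nFrom here on, this line needs escaping in mbox files.\n")

# mbox-style output: envelope line first, body "From " lines prefixed with ">".
mbox_out = BytesIO()
BytesGenerator(mbox_out, mangle_from_=True).flatten(msg, unixfrom=True)
mbox_bytes = mbox_out.getvalue()

# Plain output: no envelope line and no mangling, like msg.as_bytes().
plain_out = BytesIO()
BytesGenerator(plain_out, mangle_from_=False, policy=msg.policy).flatten(msg)
plain_bytes = plain_out.getvalue()

print(mbox_bytes.decode("ascii"))
```

Since the message has no envelope header of its own, the generator crafts one
for the mbox-style output; the plain output should equal ``msg.as_bytes()``.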


Because strings cannot represent binary data, the :class:`Generator` class must
convert any binary data in any message it flattens to an ASCII compatible
format, by applying an ASCII compatible
:mailheader:`Content-Transfer-Encoding`.  Using the terminology of the email
RFCs, you can think of this as :class:`Generator` serializing to an I/O stream
that is not "8 bit clean".  In other words, most applications will want
to be using :class:`BytesGenerator`, and not :class:`Generator`.
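
The difference between the two generators can be illustrated with a message
whose body uses an ``8bit`` transfer encoding.  This is a sketch under the
assumption that the message was parsed from bytes with the
:data:`~email.policy.default` policy; the sample data and variable names are
made up for the example, and the exact encoding chosen for the text output is
left to the library.

```python
from io import BytesIO, StringIO
from email import policy
from email.parser import BytesParser
from email.generator import BytesGenerator, Generator

# Sample message with a UTF-8 body sent as raw 8-bit data.
raw = (
    b"Subject: Cafe menu\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xc3\xa9\n"
)
msg = BytesParser(policy=policy.default).parsebytes(raw)

# BytesGenerator can reproduce the non-ASCII bytes as they are.
binary_out = BytesIO()
BytesGenerator(binary_out, policy=msg.policy).flatten(msg)
as_binary = binary_out.getvalue()

# Generator must produce text, so the body is re-encoded to plain ASCII.
text_out = StringIO()
Generator(text_out, policy=msg.policy).flatten(msg)
as_text = text_out.getvalue()

print(as_text)
```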

.. class:: Generator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                     policy=None)

   Return a :class:`Generator` object that will write any message provided
   to the :meth:`flatten` method, or any text provided to the :meth:`write`
   method, to the :term:`file-like object` *outfp*.  *outfp* must support a
   ``write`` method that accepts string data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line.  *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_).

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*, or if ``0``, do not rewrap any headers.  If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation.  If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation.  See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`Generator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit``, generate the message as if the option were set to ``7bit``.
      (This is required because strings cannot represent non-ASCII bytes.)
      Convert any bytes with the high bit set as needed using an
      ASCII-compatible :mailheader:`Content-Transfer-Encoding`.  That is,
      transform parts with non-ASCII :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object.  If the root object has
      no envelope header, craft a standard one.  The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the separator character between
      all the lines of the flattened message.  If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.

      .. versionchanged:: 3.2
         Added support for re-encoding ``8bit`` message bodies, and the
         *linesep* argument.


   .. method:: clone(fp)

      Return an independent clone of this :class:`Generator` instance with the
      exact same options, and *fp* as the new *outfp*.


   .. method:: write(s)

      Write *s* to the *write* method of the *outfp* passed to the
      :class:`Generator`'s constructor.  This provides just enough file-like
      API for :class:`Generator` instances to be used in the :func:`print`
      function.


As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_string` and ``str(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__str__`), which simplify the generation of
a formatted string representation of a message object.  For more detail, see
:mod:`email.message`.
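
The following sketch shows :class:`Generator` writing to an in-memory text
stream, then a clone of it writing the same message with ``"\r\n"`` line
separators via the *linesep* argument of ``flatten``.  The message contents and
variable names are assumptions made for the example; passing ``policy=msg.policy``
is one way to make the generator use the same settings as
:meth:`~email.message.EmailMessage.as_string`.

```python
from io import StringIO
from email.message import EmailMessage
from email.generator import Generator

msg = EmailMessage()
msg["Subject"] = "Line separators"
msg.set_content("first line\nsecond line\n")

# Flatten with the message's own policy; lines end with the policy's linesep.
out = StringIO()
gen = Generator(out, policy=msg.policy)
gen.flatten(msg)
same_as_str = out.getvalue() == msg.as_string()

# A clone keeps the option settings but writes to a different stream.
crlf_out = StringIO()
crlf_gen = gen.clone(crlf_out)
crlf_gen.flatten(msg, linesep="\r\n")
crlf_text = crlf_out.getvalue()

print(repr(crlf_text))
```

Overriding *linesep* per call leaves the policy itself untouched, which can be
convenient when the same message is written both to disk and to a network
protocol that expects CRLF.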


The :mod:`email.generator` module also provides a derived class,
:class:`DecodedGenerator`, which is like the :class:`Generator` base class,
except that non-\ :mimetype:`text` parts are not serialized, but are instead
represented in the output stream by a string derived from a template filled
in with information about the part.

.. class:: DecodedGenerator(outfp, mangle_from_=None, maxheaderlen=None, \
                            fmt=None, *, policy=None)

   Act like :class:`Generator`, except that for any subpart of the message
   passed to :meth:`Generator.flatten`, if the subpart is of main type
   :mimetype:`text`, print the decoded payload of the subpart, and if the main
   type is not :mimetype:`text`, instead of printing it, fill in the string
   *fmt* using information from the part and print the resulting
   filled-in string.

   To fill in *fmt*, execute ``fmt % part_info``, where ``part_info``
   is a dictionary composed of the following keys and values:

   * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part

   * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part

   * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part

   * ``filename`` -- Filename of the non-\ :mimetype:`text` part

   * ``description`` -- Description associated with the non-\ :mimetype:`text` part

   * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part

   If *fmt* is ``None``, use the following default *fmt*::

      "[Non-text (%(type)s) part of message omitted, filename %(filename)s]"

   Optional *mangle_from_* and *maxheaderlen* are as with the
   :class:`Generator` base class.
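
A short sketch of :class:`DecodedGenerator` with a custom *fmt* follows.  The
message, the attachment bytes, and the template string are all assumptions
made for illustration; the template uses only the ``type`` and ``filename``
keys listed above.

```python
from io import StringIO
from email.message import EmailMessage
from email.generator import DecodedGenerator

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("See the attached file.\n")
msg.add_attachment(
    b"\x00\x01\x02",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

# Text parts are printed; other parts are replaced by the filled-in template.
fmt = "[%(type)s part omitted: %(filename)s]"
out = StringIO()
DecodedGenerator(out, fmt=fmt).flatten(msg)
summary = out.getvalue()

print(summary)
```

This kind of output is mainly useful for showing a readable summary of a
message, since the binary payload is not included.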


.. rubric:: Footnotes

.. [#] This statement assumes that you use the appropriate setting for
       ``unixfrom``, and that there are no :mod:`~email.policy` settings calling
       for automatic adjustments (for example,
       :attr:`~email.policy.Policy.refold_source` must be ``none``, which is
       *not* the default).  It is also not 100% true, since if the message
       does not conform to the RFC standards, information about the exact
       original text is occasionally lost during parsing error recovery.  It is
       a goal to fix these latter edge cases when possible.