:mod:`email.generator`: Generating MIME documents
-------------------------------------------------

.. module:: email.generator
   :synopsis: Generate flat text email messages from a message structure.


One of the most common tasks is to generate the flat text of the email message
represented by a message object structure.  You will need to do this if you want
to send your message via the :mod:`smtplib` module or the :mod:`nntplib` module,
or print the message on the console.  Taking a message object structure and
producing a flat text document is the job of the :class:`Generator` class.
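
As a minimal sketch of that job, the following builds a small message and
flattens it into an in-memory text buffer.  The addresses and the variable
names (``msg``, ``buf``, ``text``) are placeholders chosen for illustration.

```python
import io
from email.generator import Generator
from email.message import Message

# Build a tiny message object; the addresses are illustrative only.
msg = Message()
msg["From"] = "author@example.com"
msg["To"] = "reader@example.com"
msg["Subject"] = "Greetings"
msg.set_payload("Hello from the generator.\n")

# Flatten the message into a text buffer instead of a real file.
buf = io.StringIO()
Generator(buf).flatten(msg)
text = buf.getvalue()
print(text)
```

Any object with a suitable ``write`` method, such as ``sys.stdout`` or an open
text file, could stand in for the buffer.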

Again, as with the :mod:`email.parser` module, you aren't limited to the
functionality of the bundled generator; you could write one from scratch
yourself.  However the bundled generator knows how to generate most email in a
standards-compliant way, should handle MIME and non-MIME email messages just
fine, and is designed so that the transformation from flat text, to a message
structure via the :class:`~email.parser.Parser` class, and back to flat text,
is idempotent (the input is identical to the output) [#]_.  On the other hand,
using the :class:`Generator` on a :class:`~email.message.Message` constructed
programmatically
may result in changes to the :class:`~email.message.Message` object as defaults
are filled in.
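
The round trip described above can be sketched as follows, assuming a simple,
well-formed message and the settings the footnote calls for
(``maxheaderlen=0``, plus ``mangle_from_=False`` so body lines are left alone).
The message text is made up for the example.

```python
import io
from email.generator import Generator
from email.parser import Parser

# A simple, already well-formed message (contents are illustrative).
original = (
    "From: author@example.com\n"
    "To: reader@example.com\n"
    "Subject: Round trip\n"
    "\n"
    "First line of the body.\n"
    "Second line of the body.\n"
)

msg = Parser().parsestr(original)

buf = io.StringIO()
# maxheaderlen=0 leaves header line lengths alone; mangle_from_=False
# leaves body lines starting with "From " untouched.
Generator(buf, mangle_from_=False, maxheaderlen=0).flatten(msg)
regenerated = buf.getvalue()

print(regenerated == original)
```

Messages with unusual header whitespace may not survive the round trip
unchanged, as the footnote explains.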

:class:`bytes` output can be generated using the :class:`BytesGenerator` class.
If the message object structure contains non-ASCII bytes, this generator's
:meth:`~BytesGenerator.flatten` method will emit the original bytes.  Parsing a
binary message and then flattening it with :class:`BytesGenerator` should be
idempotent for standards compliant messages.
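
A hedged sketch of the binary round trip: the message below is invented for
the example and carries a UTF-8 body with an ``8bit``
Content-Transfer-Encoding.  It is parsed with
:func:`email.message_from_bytes` and written back with
:class:`BytesGenerator`.

```python
import io
from email import message_from_bytes
from email.generator import BytesGenerator

# Illustrative binary message with non-ASCII bytes in the body.
raw = (
    b"From: author@example.com\n"
    b"To: reader@example.com\n"
    b"Subject: Bytes round trip\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xc3\xa9 au lait\n"
)

msg = message_from_bytes(raw)

# BytesGenerator needs a binary file-like object.
out = io.BytesIO()
BytesGenerator(out, mangle_from_=False, maxheaderlen=0).flatten(msg)
flattened = out.getvalue()

print(flattened == raw)
```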

Here are the public methods of the :class:`Generator` class, imported from the
:mod:`email.generator` module:


.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, policy=None)

   The constructor for the :class:`Generator` class takes a :term:`file-like object`
   called *outfp* for an argument.  *outfp* must support the :meth:`write` method
   and be usable as the output file for the :func:`print` function.

   Optional *mangle_from_* is a flag that, when ``True``, puts a ``>`` character in
   front of any line in the body that begins with ``From`` followed by a space.
   This is the only guaranteed portable way to avoid having such lines be
   mistaken for a Unix mailbox format
   envelope header separator (see `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_ for details).  *mangle_from_*
   defaults to ``True``, but you might want to set this to ``False`` if you are not
   writing Unix mailbox format files.

   Optional *maxheaderlen* specifies the longest length for a non-continued header.
   When a header line is longer than *maxheaderlen* (in characters, with tabs
   expanded to 8 spaces), the header will be split as defined in the
   :class:`~email.header.Header` class.  Set to zero to disable header wrapping.
   The default is 78, as recommended (but not required) by :rfc:`2822`.

   The *policy* keyword specifies a :mod:`~email.policy` object that controls a
   number of aspects of the generator's operation.  If no *policy* is specified,
   then the *policy* attached to the message object passed to :meth:`flatten`
   is used.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   The other public :class:`Generator` methods are:


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted at
      *msg* to the output file specified when the :class:`Generator` instance
      was created.  Subparts are visited depth-first and the resulting text will
      be properly MIME encoded.

      Optional *unixfrom* is a flag that forces the printing of the envelope
      header delimiter before the first :rfc:`2822` header of the root message
      object.  If the root object has no envelope header, a standard one is
      crafted.  By default, this is set to ``False`` to inhibit the printing of
      the envelope delimiter.

      Note that for subparts, no envelope header is ever printed.

      Optional *linesep* specifies the line separator string used to
      terminate lines in the output (for example ``'\r\n'``).  If specified, it
      overrides the value specified by the *msg*\'s or ``Generator``\'s
      ``policy``.

      Because strings cannot represent non-ASCII bytes, if the policy that
      applies when ``flatten`` is run has :attr:`~email.policy.Policy.cte_type`
      set to ``8bit``, ``Generator`` will operate as if it were set to
      ``7bit``.  This means that messages parsed with a Bytes parser that have
      a :mailheader:`Content-Transfer-Encoding` of ``8bit`` will be converted
      to use a ``7bit`` Content-Transfer-Encoding.  Non-ASCII bytes in the
      headers will be :rfc:`2047` encoded with a charset of ``unknown-8bit``.

      .. versionchanged:: 3.2
         Added support for re-encoding ``8bit`` message bodies, and the
         *linesep* argument.

   .. method:: clone(fp)

      Return an independent clone of this :class:`Generator` instance with the
      exact same options, using *fp* as the new output file.

   .. method:: write(s)

      Write the string *s* to the underlying file object, i.e. *outfp* passed to
      :class:`Generator`'s constructor.  This provides just enough file-like API
      for :class:`Generator` instances to be used in the :func:`print` function.
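
The constructor and :meth:`flatten` options described above can be sketched
together as follows.  The ``render`` helper and the sample message are
hypothetical names introduced only for this example; they are not part of the
module.

```python
import io
from email.generator import Generator
from email.message import Message


def render(msg, *, unixfrom=False, linesep=None, **options):
    """Flatten msg with a Generator built from options; return the text."""
    buf = io.StringIO()
    Generator(buf, **options).flatten(msg, unixfrom=unixfrom, linesep=linesep)
    return buf.getvalue()


msg = Message()
msg["Subject"] = " ".join(["word"] * 30)  # a deliberately long header
msg.set_payload("Hi,\nFrom now on we meet on Mondays.\n")

# mangle_from_: body lines starting with "From " gain a leading ">".
mangled = render(msg, mangle_from_=True)
plain = render(msg, mangle_from_=False)

# maxheaderlen: 0 disables wrapping, a small value forces folding.
unwrapped = render(msg, maxheaderlen=0)
wrapped = render(msg, maxheaderlen=40)
unwrapped_headers = unwrapped.split("\n\n", 1)[0].split("\n")
wrapped_headers = wrapped.split("\n\n", 1)[0].split("\n")

# unixfrom: an envelope line is crafted when the message has none.
with_envelope = render(msg, unixfrom=True)

# linesep: overrides the policy's line separator for this call.
crlf = render(msg, linesep="\r\n")

# write() lets a generator act as the file argument of print().
printed = io.StringIO()
gen = Generator(printed)
print("written through print()", file=gen)

# clone() gives a generator with the same options but a new output file.
cloned = io.StringIO()
gen.clone(cloned).flatten(msg)
```

Keeping each option in its own call makes it easier to see which setting is
responsible for which change in the output.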

As a convenience, see the :class:`~email.message.Message` methods
:meth:`~email.message.Message.as_string` and ``str(aMessage)``, a.k.a.
:meth:`~email.message.Message.__str__`, which simplify the generation of a
formatted string representation of a message object.  For more detail, see
:mod:`email.message`.
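
A small sketch of how these convenience methods relate to an explicit
generator.  It assumes the defaults that CPython's ``as_string`` uses at the
time of writing (no ``From`` mangling and no header wrapping); the message
itself is illustrative.

```python
import io
from email.generator import Generator
from email.message import Message

msg = Message()
msg["Subject"] = "Convenience methods"
msg.set_payload("Short body.\n")

# Spelled out with an explicit generator...
buf = io.StringIO()
Generator(buf, mangle_from_=False, maxheaderlen=0).flatten(msg)
explicit = buf.getvalue()

# ...and through the convenience methods on the message itself.
via_as_string = msg.as_string()
via_str = str(msg)

print(explicit == via_as_string == via_str)
```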

.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \
                          policy=None)

   The constructor for the :class:`BytesGenerator` class takes a binary
   :term:`file-like object` called *outfp* for an argument.  *outfp* must
   support a :meth:`write` method that accepts binary data.

   Optional *mangle_from_* is a flag that, when ``True``, puts a ``>``
   character in front of any line in the body that begins with ``From``
   followed by a space.  This is the only guaranteed portable way to avoid
   having such lines be mistaken for a
   Unix mailbox format envelope header separator (see `WHY THE CONTENT-LENGTH
   FORMAT IS BAD <http://www.jwz.org/doc/content-length.html>`_ for details).
   *mangle_from_* defaults to ``True``, but you might want to set this to
   ``False`` if you are not writing Unix mailbox format files.

   Optional *maxheaderlen* specifies the longest length for a non-continued
   header.  When a header line is longer than *maxheaderlen* (in characters,
   with tabs expanded to 8 spaces), the header will be split as defined in the
   :class:`~email.header.Header` class.  Set to zero to disable header
   wrapping.  The default is 78, as recommended (but not required) by
   :rfc:`2822`.


   The *policy* keyword specifies a :mod:`~email.policy` object that controls a
   number of aspects of the generator's operation.  If no *policy* is specified,
   then the *policy* attached to the message object passed to :meth:`flatten`
   is used.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   The other public :class:`BytesGenerator` methods are:


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`BytesGenerator`
      instance was created.  Subparts are visited depth-first and the resulting
      text will be properly MIME encoded.  If the :mod:`~email.policy` option
      :attr:`~email.policy.Policy.cte_type` is ``8bit`` (the default),
      then any bytes with the high bit set in the original parsed message that
      have not been modified will be copied faithfully to the output.  If
      ``cte_type`` is ``7bit``, the bytes will be converted as needed
      using an ASCII-compatible Content-Transfer-Encoding.  In particular,
      RFC-invalid non-ASCII bytes in headers will be encoded using the MIME
      ``unknown-8bit`` character set, thus rendering them RFC-compliant.

      .. XXX: There should be a complementary option that just does the RFC
         compliance transformation but leaves CTE 8bit parts alone.

      Messages parsed with a Bytes parser that have a
      :mailheader:`Content-Transfer-Encoding` of ``8bit`` will be reconstructed
      as ``8bit`` if they have not been modified.

      Optional *unixfrom* is a flag that forces the printing of the envelope
      header delimiter before the first :rfc:`2822` header of the root message
      object.  If the root object has no envelope header, a standard one is
      crafted.  By default, this is set to ``False`` to inhibit the printing of
      the envelope delimiter.

      Note that for subparts, no envelope header is ever printed.

      Optional *linesep* specifies the line separator string used to
      terminate lines in the output (for example ``'\r\n'``).  If specified, it
      overrides the value specified by the ``BytesGenerator``\'s or *msg*\'s
      ``policy``.

   .. method:: clone(fp)

      Return an independent clone of this :class:`BytesGenerator` instance with
      the exact same options, using *fp* as the new output file.

   .. method:: write(s)

      Write the string *s* to the underlying file object.  *s* is encoded using
      the ``ASCII`` codec and written to the *write* method of the *outfp*
      passed to the :class:`BytesGenerator`'s constructor.  This
      provides just enough file-like API for :class:`BytesGenerator` instances
      to be used in the :func:`print` function.

   .. versionadded:: 3.2
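
The effect of ``cte_type`` described above can be sketched as follows.  The
message is invented for the example, and cloning the message's own policy with
``cte_type="7bit"`` is just one way to obtain such a policy.

```python
import io
from email import message_from_bytes
from email.generator import BytesGenerator, Generator

# Illustrative message with a UTF-8 body sent as 8bit.
raw = (
    b"Subject: Transfer encoding demo\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xc3\xa9 au lait\n"
)
msg = message_from_bytes(raw)

# Default policy (cte_type="8bit"): unmodified bytes are copied through.
eight_bit = io.BytesIO()
BytesGenerator(eight_bit).flatten(msg)

# A cloned policy with cte_type="7bit": the body is re-encoded as ASCII.
seven_bit = io.BytesIO()
BytesGenerator(seven_bit, policy=msg.policy.clone(cte_type="7bit")).flatten(msg)

# The text Generator always behaves as if cte_type were "7bit".
as_text = io.StringIO()
Generator(as_text).flatten(msg)

# Re-parsing the 7bit output recovers the original body bytes.
recovered = message_from_bytes(seven_bit.getvalue()).get_payload(decode=True)
```

The ``7bit`` output differs from the input on the wire, but decoding its
payload yields the same body bytes.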

The :mod:`email.generator` module also provides a derived class, called
:class:`DecodedGenerator`, which is like the :class:`Generator` base class,
except that non-\ :mimetype:`text` parts are substituted with a format string
representing the part.


.. class:: DecodedGenerator(outfp, mangle_from_=True, maxheaderlen=78, fmt=None)

   This class, derived from :class:`Generator`, walks through all the subparts of a
   message.  If the subpart is of main type :mimetype:`text`, then it prints the
   decoded payload of the subpart.  Optional *mangle_from_* and *maxheaderlen* are
   as with the :class:`Generator` base class.

   If the subpart is not of main type :mimetype:`text`, optional *fmt* is a format
   string that is used instead of the message payload. *fmt* is expanded with the
   following keywords, ``%(keyword)s`` format:

   * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part

   * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part

   * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part

   * ``filename`` -- Filename of the non-\ :mimetype:`text` part

   * ``description`` -- Description associated with the non-\ :mimetype:`text` part

   * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part

   The default value for *fmt* is ``None``, meaning ::

      [Non-text (%(type)s) part of message omitted, filename %(filename)s]
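
A hedged sketch of :class:`DecodedGenerator` in use.  The two-part message,
the file name ``report.pdf`` and the custom format string are made up for the
example; only the ``%(keyword)s`` names come from the list above.

```python
import io
from email.generator import DecodedGenerator
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# A two-part message: one text part and one binary attachment.
msg = MIMEMultipart()
msg["Subject"] = "Quarterly report"
msg.attach(MIMEText("The report is attached.\n"))
attachment = MIMEApplication(b"%PDF-1.4 placeholder bytes", "pdf")
attachment.add_header("Content-Disposition", "attachment", filename="report.pdf")
msg.attach(attachment)

# Default placeholder for the non-text part.
default_out = io.StringIO()
DecodedGenerator(default_out).flatten(msg)

# Custom placeholder built from the documented keywords.
custom_out = io.StringIO()
DecodedGenerator(
    custom_out, fmt="[%(type)s, %(encoding)s: %(filename)s]"
).flatten(msg)

print(custom_out.getvalue())
```

The text part is printed as is, while the attachment is replaced by the
expanded format string.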


.. rubric:: Footnotes

.. [#] This statement assumes that you use the appropriate setting for the
       ``unixfrom`` argument, and that you set ``maxheaderlen=0`` (which will
       preserve whatever the input line lengths were).  It is also not strictly
       true, since in many cases runs of whitespace in headers are collapsed
       into single blanks.  The latter is a bug that will eventually be fixed.