:mod:`email.utils`: Miscellaneous utilities
-------------------------------------------

.. module:: email.utils
   :synopsis: Miscellaneous email package utilities.


There are several useful utilities provided in the :mod:`email.utils` module:


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes, and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* ends and
   begins with double quotes, they are stripped off.  Likewise if *str* ends and
   begins with angle brackets, they are stripped off.
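
As an illustration of :func:`quote` and :func:`unquote`, the following minimal sketch round-trips a string containing double quotes and a backslash. The sample string and the angle-bracketed address are made up for the example.

```python
from email.utils import quote, unquote

raw = 'He said "hi" \\ bye'          # contains two double quotes and one backslash

# quote() escapes backslashes and double quotes; it does not add outer quotes.
quoted = quote(raw)

# unquote() strips one pair of surrounding double quotes and undoes the escaping.
restored = unquote('"' + quoted + '"')

# Surrounding angle brackets are stripped as well.
bare = unquote('<jane@example.com>')

print(quoted)
print(restored)
print(bare)
```

Note that :func:`quote` leaves it to the caller to add the enclosing double quotes.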


.. function:: parseaddr(address)

   Parse address -- which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc` -- into its constituent *realname* and
   *email address* parts.  Returns a tuple of that information, unless the parse
   fails, in which case a 2-tuple of ``('', '')`` is returned.


.. function:: formataddr(pair, charset='utf-8')

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header.  If the first element of *pair* is false, then the
   second element is returned unmodified.

   Optional *charset* is the character set that will be used in the :rfc:`2047`
   encoding of the ``realname`` if the ``realname`` contains non-ASCII
   characters.  Can be an instance of :class:`str` or a
   :class:`~email.charset.Charset`.  Defaults to ``utf-8``.

   .. versionchanged:: 3.3
      Added the *charset* option.
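
The relationship between :func:`parseaddr` and :func:`formataddr` can be sketched as follows. The names and addresses are invented for the example, and the exact :rfc:`2047` encoded form of a non-ASCII name is deliberately not shown because it depends on the charset's encoding choice.

```python
from email.utils import formataddr, parseaddr

# Split a header value into (realname, email_address).
name, addr = parseaddr('Jane Doe <jane@example.com>')

# Rebuild the header value from the pair.
header = formataddr((name, addr))

# A false realname returns the address unmodified.
bare = formataddr(('', 'jane@example.com'))

# A non-ASCII realname is RFC 2047 encoded using *charset* (utf-8 by default).
encoded = formataddr(('José', 'jose@example.com'))

# A failed parse yields a pair of empty strings.
failed = parseaddr('')

print(name, addr)
print(header)
print(encoded)
```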

.. function:: getaddresses(fieldvalues)

   This function returns a list of 2-tuples of the form returned by
   :func:`parseaddr`.  *fieldvalues* is a sequence of header field values as might
   be returned by :meth:`Message.get_all <email.message.Message.get_all>`.  Here's
   a simple example that gets all the recipients of a message::

      from email.utils import getaddresses

      # msg is assumed to be an existing email.message.Message instance
      tos = msg.get_all('to', [])
      ccs = msg.get_all('cc', [])
      resent_tos = msg.get_all('resent-to', [])
      resent_ccs = msg.get_all('resent-cc', [])
      all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`.  However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases.  *date* is a string containing an :rfc:`2822`
   date, such as ``"Mon, 20 Nov 1995 19:12:08 -0500"``.  If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned.  Note that indexes 6,
   7, and 8 of the result tuple are not usable.


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time) [#]_.  If the input string
   has no timezone, the last element of the tuple returned is ``None``.  Note that
   indexes 6, 7, and 8 of the result tuple are not usable.


.. function:: parsedate_to_datetime(date)

   The inverse of :func:`format_datetime`.  Performs the same function as
   :func:`parsedate`, but on success returns a :class:`~datetime.datetime`.  If
   the input date has a timezone of ``-0000``, the ``datetime`` will be a naive
   ``datetime``, and if the date conforms to the RFCs it will represent a
   time in UTC but with no indication of the actual source timezone of the
   message the date comes from.  If the input date has any other valid timezone
   offset, the ``datetime`` will be an aware ``datetime`` with the
   corresponding :class:`~datetime.timezone` :class:`~datetime.tzinfo`.

   .. versionadded:: 3.3
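
A brief sketch of the two cases described for :func:`parsedate_to_datetime`: a date with a real offset yields an aware ``datetime``, while ``-0000`` yields a naive one. The date strings are sample values only.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# A numeric offset other than -0000 produces an aware datetime.
aware = parsedate_to_datetime('Mon, 20 Nov 1995 19:12:08 -0500')

# -0000 means "UTC, source timezone unknown" and produces a naive datetime.
naive = parsedate_to_datetime('Mon, 20 Nov 1995 19:12:08 -0000')

print(aware.utcoffset())   # offset of the aware result
print(naive.tzinfo)        # None for the naive result
```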


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC
   timestamp (seconds since the Epoch).  If the timezone item in the
   tuple is ``None``, assume local time.


.. function:: formatdate(timeval=None, localtime=False, usegmt=False)

   Returns a date string as per :rfc:`2822`, e.g.::

      Fri, 09 Nov 2001 01:08:47 -0000

   Optional *timeval* if given is a floating point time value as accepted by
   :func:`time.gmtime` and :func:`time.localtime`, otherwise the current time is
   used.

   Optional *localtime* is a flag that, when ``True``, interprets *timeval* and
   returns a date relative to the local timezone instead of UTC, properly taking
   daylight saving time into account.  The default is ``False``, meaning UTC is
   used.

   Optional *usegmt* is a flag that, when ``True``, outputs a date string with the
   timezone as the ASCII string ``GMT`` rather than a numeric ``-0000``.  This is
   needed for some protocols (such as HTTP).  It only applies when *localtime* is
   ``False``.  The default is ``False``.
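
As a quick illustration of the *timeval* and *usegmt* arguments of :func:`formatdate`, the sketch below formats the Epoch itself (``timeval=0``), chosen only because its UTC representation is easy to verify.

```python
from email.utils import formatdate

# Default: UTC with the numeric -0000 zone.
numeric = formatdate(0)

# usegmt=True: same instant, with the zone spelled GMT (as HTTP expects).
gmt = formatdate(0, usegmt=True)

# No timeval: the current time is used.
now = formatdate()

print(numeric)
print(gmt)
print(now)
```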


.. function:: format_datetime(dt, usegmt=False)

   Like :func:`formatdate`, but the input is a :class:`~datetime.datetime`
   instance.  If it is a naive ``datetime``, it is assumed to be "UTC with no
   information about the source timezone", and the conventional ``-0000`` is used
   for the timezone.  If it is an aware ``datetime``, then the numeric timezone
   offset is used.  If it is an aware ``datetime`` with offset zero, then *usegmt*
   may be set to ``True``, in which case the string ``GMT`` is used instead of the
   numeric timezone offset.  This provides a way to generate standards-conformant
   HTTP date headers.

   .. versionadded:: 3.3
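
The three cases described for :func:`format_datetime` (naive, aware, and aware UTC with *usegmt*) can be sketched like this. The timestamp and the ``-0500`` offset are arbitrary sample values.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

naive = datetime(2001, 11, 9, 1, 8, 47)

# Naive datetime: formatted with the conventional -0000 zone.
as_naive = format_datetime(naive)

# Aware datetime: the numeric offset is used.
as_aware = format_datetime(naive.replace(tzinfo=timezone(timedelta(hours=-5))))

# Aware datetime in UTC with usegmt=True: the zone is written as GMT.
as_gmt = format_datetime(naive.replace(tzinfo=timezone.utc), usegmt=True)

print(as_naive)
print(as_aware)
print(as_gmt)
```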


.. function:: localtime(dt=None, isdst=-1)

    Return local time as an aware datetime object.  If called without
    arguments, return current time.  Otherwise *dt* argument should be a
    :class:`~datetime.datetime` instance, and it is converted to the local time
    zone according to the system time zone database.  If *dt* is naive (that
    is, ``dt.tzinfo`` is ``None``), it is assumed to be in local time.  In this
    case, a positive or zero value for *isdst* causes ``localtime`` to presume
    initially that summer time (for example, Daylight Saving Time) is or is not
    (respectively) in effect for the specified time.  A negative value for
    *isdst* causes ``localtime`` to try to determine whether summer time
    is in effect for the specified time.

    .. versionadded:: 3.3
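
A minimal sketch of :func:`localtime` with and without an argument. It relies only on the *dt* parameter; the UTC sample instant is arbitrary, and the resulting local offset depends on the system's time zone settings.

```python
from datetime import datetime, timezone
from email.utils import localtime

# No argument: the current time as an aware datetime in the local zone.
now = localtime()

# Aware input: converted to the local zone; the instant itself is unchanged.
utc_moment = datetime(2001, 11, 9, 1, 8, 47, tzinfo=timezone.utc)
local = localtime(utc_moment)

print(now.isoformat())
print(local.isoformat())
```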


.. function:: make_msgid(idstring=None, domain=None)

   Returns a string suitable for an :rfc:`2822`\ -compliant
   :mailheader:`Message-ID` header.  Optional *idstring*, if given, is a string
   used to strengthen the uniqueness of the message id.  Optional *domain*, if
   given, provides the portion of the msgid after the '@'.  The default is the
   local hostname.  It is not normally necessary to override this default, but
   doing so may be useful in certain cases, such as constructing a distributed
   system that uses a consistent domain name across multiple hosts.

   .. versionchanged:: 3.2
      Added the *domain* keyword.
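
For illustration, the sketch below generates a message id with both optional arguments of :func:`make_msgid`. The *idstring* and *domain* values are hypothetical, and the part before the ``@`` varies from call to call.

```python
from email.utils import make_msgid

# Fix the domain so ids look the same regardless of which host creates them.
msgid = make_msgid(idstring='order42', domain='mail.example.com')

# The result is wrapped in angle brackets, ready for a Message-ID header.
print(msgid)
```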


.. function:: decode_rfc2231(s)

   Decode the string *s* according to :rfc:`2231`.


.. function:: encode_rfc2231(s, charset=None, language=None)

   Encode the string *s* according to :rfc:`2231`.  Optional *charset* and
   *language*, if given, are the character set name and language name to use.  If
   neither is given, *s* is returned as-is.  If *charset* is given but *language*
   is not, the string is encoded using the empty string for *language*.
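
The following sketch shows what :func:`encode_rfc2231` produces for a sample file name and how :func:`decode_rfc2231` splits the result back into its three parts. The file name is invented; note that the decoded value part is still percent-encoded at this stage.

```python
from email.utils import decode_rfc2231, encode_rfc2231

# charset given, language omitted: the language slot is left empty.
encoded = encode_rfc2231('naïve file.txt', charset='utf-8')

# Both charset and language given.
encoded_en = encode_rfc2231('naïve file.txt', charset='utf-8', language='en')

# Split into charset, language and the (still percent-encoded) value.
charset, language, value = decode_rfc2231(encoded)

print(encoded)
print(encoded_en)
print(charset, repr(language), value)
```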


.. function:: collapse_rfc2231_value(value, errors='replace', fallback_charset='us-ascii')

   When a header parameter is encoded in :rfc:`2231` format,
   :meth:`Message.get_param <email.message.Message.get_param>` may return a
   3-tuple containing the character set, language, and value.
   :func:`collapse_rfc2231_value` turns this into a unicode string.  Optional
   *errors* is passed to the *errors* argument of :class:`str`'s
   :func:`~str.encode` method; it defaults to ``'replace'``.  Optional
   *fallback_charset* specifies the character set to use if the one in the
   :rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``.

   For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not
   a tuple, it should be a string and it is returned unquoted.
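
A small sketch of both input forms accepted by :func:`collapse_rfc2231_value`. The 3-tuple is written by hand as a stand-in for what ``get_param`` may return; it is assumed here that the value element carries the raw UTF-8 bytes as individual code points, which is why it looks garbled before collapsing.

```python
from email.utils import collapse_rfc2231_value

# Hand-written stand-in for a (charset, language, value) tuple; the value
# holds the UTF-8 bytes C3 AF of "ï" as two separate characters.
param = ('utf-8', '', 'na\xc3\xafve.txt')
filename = collapse_rfc2231_value(param)

# A plain string is simply returned unquoted.
plain = collapse_rfc2231_value('"plain.txt"')

print(filename)
print(plain)
```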


.. function:: decode_params(params)

   Decode parameters list according to :rfc:`2231`.  *params* is a sequence of
   2-tuples containing elements of the form ``(content-type, string-value)``.


.. rubric:: Footnotes

.. [#] Note that the sign of the timezone offset is the opposite of the sign of the
   ``time.timezone`` variable for the same timezone; the latter variable follows
   the POSIX standard while this module follows :rfc:`2822`.