\section{\module{mailbox} ---
          Manipulate mailboxes in various formats}

\declaremodule{}{mailbox}
\moduleauthor{Gregory K.~Johnson}{gkj@gregorykjohnson.com}
\sectionauthor{Gregory K.~Johnson}{gkj@gregorykjohnson.com}
\modulesynopsis{Manipulate mailboxes in various formats}

This module defines two classes, \class{Mailbox} and \class{Message}, for
accessing and manipulating on-disk mailboxes and the messages they contain.
\class{Mailbox} offers a dictionary-like mapping from keys to messages.
\class{Message} extends the \module{email.Message} module's \class{Message}
class with format-specific state and behavior. Supported mailbox formats are
Maildir, mbox, MH, Babyl, and MMDF.

\begin{seealso}
    \seemodule{email}{Represent and manipulate messages.}
\end{seealso}

\subsection{\class{Mailbox} objects}
\label{mailbox-objects}

\begin{classdesc*}{Mailbox}
A mailbox, which may be inspected and modified.
\end{classdesc*}

The \class{Mailbox} interface is dictionary-like, with small keys
corresponding to messages. Keys are issued by the \class{Mailbox} instance
with which they will be used and are only meaningful to that \class{Mailbox}
instance. A key continues to identify a message even if the corresponding
message is modified, such as by replacing it with another message. Messages may
be added to a \class{Mailbox} instance using the set-like method
\method{add()} and removed using a \code{del} statement or the set-like methods
\method{remove()} and \method{discard()}.
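
The following sketch illustrates these set-like operations using the
\class{mbox} subclass. The scratch path and the message text are invented for
the example, and a recent Python interpreter is assumed.

```python
import mailbox
import os
import tempfile

# Invented scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'inbox.mbox')
box = mailbox.mbox(path)

# add() copies the message into the mailbox and returns the key issued for it.
key = box.add('From: alice@example.com\nSubject: hello\n\nFirst message.\n')
other = box.add('From: bob@example.com\nSubject: re: hello\n\nSecond message.\n')
count_after_add = len(box)

# del and remove() raise KeyError for an unknown key; discard() does not.
del box[key]
box.discard(key)  # already gone, silently ignored
try:
    box.remove(key)
    raised = False
except KeyError:
    raised = True

remaining = list(box.keys())
box.close()
```

Keys should be treated as opaque values: they are only meaningful to the
instance that issued them.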

\class{Mailbox} interface semantics differ from dictionary semantics in some
noteworthy ways. Each time a message is requested, a new representation
(typically a \class{Message} instance) is generated, based upon the current
state of the mailbox. Similarly, when a message is added to a \class{Mailbox}
instance, the provided message representation's contents are copied. In neither
case is a reference to the message representation kept by the \class{Mailbox}
instance.
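
A short sketch of these copy semantics, again using \class{mbox} on an invented
scratch path. It relies on \method{replace_header()}, which is inherited from
the \module{email} package's \class{Message} class.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'copies.mbox')
box = mailbox.mbox(path)
key = box.add('From: alice@example.com\nSubject: draft\n\nBody.\n')

# Each request builds a fresh representation from the mailbox contents.
first = box[key]
second = box[key]
distinct = first is not second

# Changing the returned object does not touch the mailbox...
first.replace_header('Subject', 'final')
unchanged = box[key]['Subject']

# ...until the modified copy is stored back under the same key.
box[key] = first
updated = box[key]['Subject']
box.close()
```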

The default \class{Mailbox} iterator iterates over message representations, not
keys as the default dictionary iterator does. Moreover, modification of a
mailbox during iteration is safe and well-defined. Messages added to the
mailbox after an iterator is created will not be seen by the iterator. Messages
removed from the mailbox before the iterator yields them will be silently
skipped, though using a key from an iterator may result in a
\exception{KeyError} exception if the corresponding message is subsequently
removed.
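
The difference from dictionary iteration can be sketched as follows. The
scratch path and the three messages are invented for the example.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'iterate.mbox')
box = mailbox.mbox(path)
for subject in ('one', 'two', 'three'):
    box.add('From: alice@example.com\nSubject: %s\n\nBody.\n' % subject)

# The default iterator yields message representations, not keys.
subjects = [message['Subject'] for message in box]

# Keys have to be requested explicitly.
keys = list(box.iterkeys())

# A key obtained earlier stops working once its message has been removed.
box.remove(keys[0])
try:
    box[keys[0]]
    stale_key_failed = False
except KeyError:
    stale_key_failed = True
box.close()
```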

\class{Mailbox} itself is intended to define an interface and to be inherited
from by format-specific subclasses but is not intended to be instantiated.
Instead, you should instantiate a subclass.

\class{Mailbox} instances have the following methods:

\begin{methoddesc}{add}{message}
Add \var{message} to the mailbox and return the key that has been assigned to
it.

Parameter \var{message} may be a \class{Message} instance, an
\class{email.Message.Message} instance, a string, or a file-like object (which
should be open in text mode). If \var{message} is an instance of the
appropriate format-specific \class{Message} subclass (e.g., if it's an
\class{mboxMessage} instance and this is an \class{mbox} instance), its
format-specific information is used. Otherwise, reasonable defaults for
format-specific information are used.
\end{methoddesc}

\begin{methoddesc}{remove}{key}
\methodline{__delitem__}{key}
\methodline{discard}{key}
Delete the message corresponding to \var{key} from the mailbox.

If no such message exists, a \exception{KeyError} exception is raised if the
method was called as \method{remove()} or \method{__delitem__()} but no
exception is raised if the method was called as \method{discard()}. The
behavior of \method{discard()} may be preferred if the underlying mailbox
format supports concurrent modification by other processes.
\end{methoddesc}

\begin{methoddesc}{__setitem__}{key, message}
Replace the message corresponding to \var{key} with \var{message}. Raise a
\exception{KeyError} exception if no message already corresponds to \var{key}.

As with \method{add()}, parameter \var{message} may be a \class{Message}
instance, an \class{email.Message.Message} instance, a string, or a file-like
object (which should be open in text mode). If \var{message} is an instance of
the appropriate format-specific \class{Message} subclass (e.g., if it's an
\class{mboxMessage} instance and this is an \class{mbox} instance), its
format-specific information is used. Otherwise, the format-specific information
of the message that currently corresponds to \var{key} is left unchanged. 
\end{methoddesc}

\begin{methoddesc}{iterkeys}{}
\methodline{keys}{}
Return an iterator over all keys if called as \method{iterkeys()} or return a
list of keys if called as \method{keys()}.
\end{methoddesc}

\begin{methoddesc}{itervalues}{}
\methodline{__iter__}{}
\methodline{values}{}
Return an iterator over representations of all messages if called as
\method{itervalues()} or \method{__iter__()} or return a list of such
representations if called as \method{values()}. The messages are represented as
instances of the appropriate format-specific \class{Message} subclass unless a
custom message factory was specified when the \class{Mailbox} instance was
initialized. \note{The behavior of \method{__iter__()} is unlike that of
dictionaries, which iterate over keys.}
\end{methoddesc}

\begin{methoddesc}{iteritems}{}
\methodline{items}{}
Return an iterator over (\var{key}, \var{message}) pairs, where \var{key} is a
key and \var{message} is a message representation, if called as
\method{iteritems()} or return a list of such pairs if called as
\method{items()}. The messages are represented as instances of the appropriate
format-specific \class{Message} subclass unless a custom message factory was
specified when the \class{Mailbox} instance was initialized.
\end{methoddesc}

\begin{methoddesc}{get}{key\optional{, default=None}}
\methodline{__getitem__}{key}
Return a representation of the message corresponding to \var{key}. If no such
message exists, \var{default} is returned if the method was called as
\method{get()} and a \exception{KeyError} exception is raised if the method was
called as \method{__getitem__()}. The message is represented as an instance of
the appropriate format-specific \class{Message} subclass unless a custom
message factory was specified when the \class{Mailbox} instance was
initialized.
\end{methoddesc}

\begin{methoddesc}{get_message}{key}
Return a representation of the message corresponding to \var{key} as an
instance of the appropriate format-specific \class{Message} subclass, or raise
a \exception{KeyError} exception if no such message exists.
\end{methoddesc}

\begin{methoddesc}{get_string}{key}
Return a string representation of the message corresponding to \var{key}, or
raise a \exception{KeyError} exception if no such message exists.
\end{methoddesc}

\begin{methoddesc}{get_file}{key}
Return a file-like representation of the message corresponding to \var{key},
or raise a \exception{KeyError} exception if no such message exists. The
file-like object behaves as if open in binary mode. This file should be closed
once it is no longer needed.

\note{Unlike other representations of messages, file-like representations are
not necessarily independent of the \class{Mailbox} instance that created them
or of the underlying mailbox. More specific documentation is provided by each
subclass.}
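
As a hedged illustration, the sketch below reads a message through
\method{get_file()} and closes the file as soon as the data has been read. It
assumes an \class{mbox} mailbox on an invented scratch path; on a Python 3
interpreter the data read from the file is a bytes object.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'files.mbox')
box = mailbox.mbox(path)
key = box.add('From: alice@example.com\nSubject: report\n\nBody.\n')

# Read the raw message data, then close the file-like object promptly.
message_file = box.get_file(key)
try:
    raw = message_file.read()
finally:
    message_file.close()

# get_string() returns an independent string representation instead.
text = box.get_string(key)
box.close()
```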
\end{methoddesc}

\begin{methoddesc}{has_key}{key}
\methodline{__contains__}{key}
Return \code{True} if \var{key} corresponds to a message, \code{False}
otherwise.
\end{methoddesc}

\begin{methoddesc}{__len__}{}
Return a count of messages in the mailbox.
\end{methoddesc}

\begin{methoddesc}{clear}{}
Delete all messages from the mailbox.
\end{methoddesc}

\begin{methoddesc}{pop}{key\optional{, default}}
Return a representation of the message corresponding to \var{key} and delete
the message. If no such message exists, return \var{default} if it was supplied
or else raise a \exception{KeyError} exception. The message is represented as
an instance of the appropriate format-specific \class{Message} subclass unless
a custom message factory was specified when the \class{Mailbox} instance was
initialized.
\end{methoddesc}

\begin{methoddesc}{popitem}{}
Return an arbitrary (\var{key}, \var{message}) pair, where \var{key} is a key
and \var{message} is a message representation, and delete the corresponding
message. If the mailbox is empty, raise a \exception{KeyError} exception. The
message is represented as an instance of the appropriate format-specific
\class{Message} subclass unless a custom message factory was specified when the
\class{Mailbox} instance was initialized.
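
For instance, a mailbox can be drained with \method{popitem()}. This is only a
sketch on an invented scratch \class{mbox} file; it also shows \method{pop()}
with a default value.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'drain.mbox')
box = mailbox.mbox(path)
for subject in ('one', 'two'):
    box.add('From: alice@example.com\nSubject: %s\n\nBody.\n' % subject)

# popitem() returns a (key, message) pair and deletes that message.
drained = []
while len(box) > 0:
    key, message = box.popitem()
    drained.append(message['Subject'])

# pop() with a default avoids KeyError for a key that does not exist.
missing = box.pop(12345, None)

# popitem() on an empty mailbox raises KeyError.
try:
    box.popitem()
    empty_raised = False
except KeyError:
    empty_raised = True
box.close()
```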
\end{methoddesc}

\begin{methoddesc}{update}{arg}
Parameter \var{arg} should be a \var{key}-to-\var{message} mapping or an
iterable of (\var{key}, \var{message}) pairs. Updates the mailbox so that, for
each given \var{key} and \var{message}, the message corresponding to \var{key}
is set to \var{message} as if by using \method{__setitem__()}. As with
\method{__setitem__()}, each \var{key} must already correspond to a message in
the mailbox or else a \exception{KeyError} exception will be raised, so in
general it is incorrect for \var{arg} to be a \class{Mailbox} instance.
\note{Unlike with dictionaries, keyword arguments are not supported.}
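
A minimal sketch of this behavior, assuming an \class{mbox} mailbox on an
invented scratch path. The key \code{999} is simply a value that was never
issued by the mailbox.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'update.mbox')
box = mailbox.mbox(path)
key = box.add('From: alice@example.com\nSubject: old\n\nBody.\n')

# Replace the message stored under an existing key.
box.update({key: 'From: alice@example.com\nSubject: new\n\nBody.\n'})
subject = box[key]['Subject']

# update() cannot add messages: a key that was never issued raises KeyError.
try:
    box.update([(999, 'From: bob@example.com\nSubject: extra\n\nBody.\n')])
    raised = False
except KeyError:
    raised = True
count = len(box)
box.close()
```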
\end{methoddesc}

\begin{methoddesc}{flush}{}
Write any pending changes to the filesystem. For some \class{Mailbox}
subclasses, changes are always written immediately and this method does
nothing.
\end{methoddesc}

\begin{methoddesc}{lock}{}
Acquire an exclusive advisory lock on the mailbox so that other processes know
not to modify it. An \exception{ExternalClashError} is raised if the lock is
not available. The particular locking mechanisms used depend upon the mailbox
format.
\end{methoddesc}

\begin{methoddesc}{unlock}{}
Release the lock on the mailbox, if any.
\end{methoddesc}

\begin{methoddesc}{close}{}
Flush the mailbox, unlock it if necessary, and close any open files. For some
\class{Mailbox} subclasses, this method does nothing.
\end{methoddesc}
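
The locking and flushing methods are typically combined along the following
lines. This is a sketch rather than a prescribed pattern, and it assumes an
\class{mbox} mailbox on an invented scratch path.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'locked.mbox')
box = mailbox.mbox(path)

# lock() raises mailbox.ExternalClashError if the lock is not available.
box.lock()
try:
    box.add('From: alice@example.com\nSubject: safe\n\nBody.\n')
    box.flush()  # write pending changes while the lock is still held
finally:
    box.unlock()
box.close()

# The message is on disk and visible to a fresh instance.
reopened = mailbox.mbox(path)
count = len(reopened)
reopened.close()
```

Releasing the lock in a \code{finally} clause ensures that other processes are
not blocked if an operation on the mailbox fails.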


\subsubsection{\class{Maildir}}
\label{mailbox-maildir}

\begin{classdesc}{Maildir}{dirname\optional{, factory=rfc822.Message\optional{,
create=True}}}
A subclass of \class{Mailbox} for mailboxes in Maildir format. Parameter
\var{factory} is a callable object that accepts a file-like message
representation (which behaves as if opened in binary mode) and returns a custom
representation. If \var{factory} is \code{None}, \class{MaildirMessage} is used
as the default message representation. If \var{create} is \code{True}, the
mailbox is created if it does not exist.

It is for historical reasons that \var{factory} defaults to
\class{rfc822.Message} and that \var{dirname} is named as such rather than
\var{path}. For a \class{Maildir} instance that behaves like instances of other
\class{Mailbox} subclasses, set \var{factory} to \code{None}.
\end{classdesc}

Maildir is a directory-based mailbox format invented for the qmail mail
transfer agent and now widely supported by other programs. Messages in a
Maildir mailbox are stored in separate files within a common directory
structure. This design allows Maildir mailboxes to be accessed and modified by
multiple unrelated programs without data corruption, so file locking is
unnecessary.

Maildir mailboxes contain three subdirectories, namely: \file{tmp}, \file{new},
and \file{cur}. Messages are created momentarily in the \file{tmp} subdirectory
and then moved to the \file{new} subdirectory to finalize delivery. A mail user
agent may subsequently move the message to the \file{cur} subdirectory and
store information about the state of the message in a special "info" section
appended to its file name.

Folders of the style introduced by the Courier mail transfer agent are also
supported. Any subdirectory of the main mailbox is considered a folder if
\character{.} is the first character in its name. Folder names are represented
by \class{Maildir} without the leading \character{.}. Each folder is itself a
Maildir mailbox but should not contain other folders. Instead, a logical
nesting is indicated using \character{.} to delimit levels, e.g.,
"Archived.2005.07".

\begin{notice}
The Maildir specification requires the use of a colon (\character{:}) in
certain message file names. However, some operating systems do not permit this
character in file names. If you wish to use a Maildir-like format on such an
operating system, you should specify another character to use instead. The
exclamation point (\character{!}) is a popular choice. For example:
\begin{verbatim}
import mailbox
mailbox.Maildir.colon = '!'
\end{verbatim}
The \member{colon} attribute may also be set on a per-instance basis.
\end{notice}

\class{Maildir} instances have all of the methods of \class{Mailbox} in
addition to the following:

\begin{methoddesc}{list_folders}{}
Return a list of the names of all folders.
\end{methoddesc}

\begin{methoddesc}{get_folder}{folder}
Return a \class{Maildir} instance representing the folder whose name is
\var{folder}. A \exception{NoSuchMailboxError} exception is raised if the
folder does not exist.
\end{methoddesc}

\begin{methoddesc}{add_folder}{folder}
Create a folder whose name is \var{folder} and return a \class{Maildir}
instance representing it.
\end{methoddesc}

\begin{methoddesc}{remove_folder}{folder}
Delete the folder whose name is \var{folder}. If the folder contains any
messages, a \exception{NotEmptyError} exception will be raised and the folder
will not be deleted.
\end{methoddesc}

\begin{methoddesc}{clean}{}
Delete temporary files from the mailbox that have not been accessed in the
last 36 hours. The Maildir specification says that mail-reading programs
should do this occasionally.
\end{methoddesc}
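
The folder methods can be sketched as follows. The scratch directory and the
message are invented for the example, and \var{factory} is set to \code{None}
explicitly so that \class{MaildirMessage} is used.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
root = os.path.join(tempfile.mkdtemp(), 'Maildir')
box = mailbox.Maildir(root, factory=None, create=True)

# Folder names are given without the leading '.', which is added on disk.
archive = box.add_folder('Archived.2005.07')
folders = box.list_folders()
on_disk = os.path.isdir(os.path.join(root, '.Archived.2005.07'))

# A folder that still holds a message cannot be removed.
key = archive.add('From: alice@example.com\nSubject: kept\n\nBody.\n')
try:
    box.remove_folder('Archived.2005.07')
    refused = False
except mailbox.NotEmptyError:
    refused = True

# Once emptied, the folder can be deleted; looking it up afterwards fails.
archive.discard(key)
box.remove_folder('Archived.2005.07')
try:
    box.get_folder('Archived.2005.07')
    missing = False
except mailbox.NoSuchMailboxError:
    missing = True
```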

Some \class{Mailbox} methods implemented by \class{Maildir} deserve special
remarks:

\begin{methoddesc}{add}{message}
\methodline[Maildir]{__setitem__}{key, message}
\methodline[Maildir]{update}{arg}
\warning{These methods generate unique file names based upon the current
process ID. When using multiple threads, undetected name clashes may occur and
cause corruption of the mailbox unless threads are coordinated to avoid using
these methods to manipulate the same mailbox simultaneously.}
\end{methoddesc}

\begin{methoddesc}{flush}{}
All changes to Maildir mailboxes are immediately applied, so this method does
nothing.
\end{methoddesc}

\begin{methoddesc}{lock}{}
\methodline{unlock}{}
Maildir mailboxes do not support (or require) locking, so these methods do
nothing. 
\end{methoddesc}

\begin{methoddesc}{close}{}
\class{Maildir} instances do not keep any open files and the underlying
mailboxes do not support locking, so this method does nothing.
\end{methoddesc}

\begin{methoddesc}{get_file}{key}
Depending upon the host platform, it may not be possible to modify or remove
the underlying message while the returned file remains open.
\end{methoddesc}

\begin{seealso}
    \seelink{http://www.qmail.org/man/man5/maildir.html}{maildir man page from
    qmail}{The original specification of the format.}
    \seelink{http://cr.yp.to/proto/maildir.html}{Using maildir format}{Notes
    on Maildir by its inventor. Includes an updated name-creation scheme and
    details on "info" semantics.}
    \seelink{http://www.courier-mta.org/?maildir.html}{maildir man page from
    Courier}{Another specification of the format. Describes a common extension
    for supporting folders.}
\end{seealso}

\subsubsection{\class{mbox}}
\label{mailbox-mbox}

\begin{classdesc}{mbox}{path\optional{, factory=None\optional{, create=True}}}
A subclass of \class{Mailbox} for mailboxes in mbox format. Parameter
\var{factory} is a callable object that accepts a file-like message
representation (which behaves as if opened in binary mode) and returns a custom
representation. If \var{factory} is \code{None}, \class{mboxMessage} is used as
the default message representation. If \var{create} is \code{True}, the mailbox
is created if it does not exist.
\end{classdesc}

The mbox format is the classic format for storing mail on \UNIX{} systems. All
messages in an mbox mailbox are stored in a single file with the beginning of
each message indicated by a line whose first five characters are "From~".
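
The single-file layout can be inspected directly. The sketch below uses an
invented scratch path and two invented messages; the second has a body line
that itself begins with "From~", which \class{mbox} stores with a leading
\character{>} so that it is not mistaken for the start of a message.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'layout.mbox')
box = mailbox.mbox(path)
box.add('From: alice@example.com\nSubject: first\n\nPlain body.\n')
box.add('From: bob@example.com\nSubject: second\n\nFrom here on, take care.\n')
box.close()

# Read the mailbox file back as raw lines.
with open(path, 'rb') as f:
    lines = f.read().splitlines()

# One delimiter line per message; the body line was quoted when stored.
delimiters = [line for line in lines if line.startswith(b'From ')]
quoted = [line for line in lines if line.startswith(b'>From ')]
```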

Several variations of the mbox format exist to address perceived shortcomings
in the original. In the interest of compatibility, \class{mbox} implements the
original format, which is sometimes referred to as \dfn{mboxo}. This means that
the \mailheader{Content-Length} header, if present, is ignored and that any
occurrences of "From~" at the beginning of a line in a message body are
transformed to ">From~" when storing the message, although occurrences of
">From~" are not transformed to "From~" when reading the message.

Some \class{Mailbox} methods implemented by \class{mbox} deserve special
remarks:

\begin{methoddesc}{get_file}{key}
Using the file after calling \method{flush()} or \method{close()} on the
\class{mbox} instance may yield unpredictable results or raise an exception.
\end{methoddesc}

\begin{methoddesc}{lock}{}
\methodline{unlock}{}
Three locking mechanisms are used---dot locking and, if available, the
\cfunction{flock()} and \cfunction{lockf()} system calls.
\end{methoddesc}

\begin{seealso}
    \seelink{http://www.qmail.org/man/man5/mbox.html}{mbox man page from
    qmail}{A specification of the format and its variations.}
    \seelink{http://www.tin.org/bin/man.cgi?section=5\&topic=mbox}{mbox man
    page from tin}{Another specification of the format, with details on
    locking.}
    \seelink{http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html}
    {Configuring Netscape Mail on \UNIX{}: Why The Content-Length Format is
    Bad}{An argument for using the original mbox format rather than a
    variation.}
    \seelink{http://homepages.tesco.net./\tilde{}J.deBoynePollard/FGA/mail-mbox-formats.html}
    {"mbox" is a family of several mutually incompatible mailbox formats}{A
    history of mbox variations.}
\end{seealso}

\subsubsection{\class{MH}}
\label{mailbox-mh}

\begin{classdesc}{MH}{path\optional{, factory=None\optional{, create=True}}}
A subclass of \class{Mailbox} for mailboxes in MH format. Parameter
\var{factory} is a callable object that accepts a file-like message
representation (which behaves as if opened in binary mode) and returns a custom
representation. If \var{factory} is \code{None}, \class{MHMessage} is used as
the default message representation. If \var{create} is \code{True}, the mailbox
is created if it does not exist.
\end{classdesc}

MH is a directory-based mailbox format invented for the MH Message Handling
System, a mail user agent. Each message in an MH mailbox resides in its own
file. An MH mailbox may contain other MH mailboxes (called \dfn{folders}) in
addition to messages. Folders may be nested indefinitely. MH mailboxes also
support \dfn{sequences}, which are named lists used to logically group messages
without moving them to sub-folders. Sequences are defined in a file called
\file{.mh_sequences} in each folder.

The \class{MH} class manipulates MH mailboxes, but it does not attempt to
emulate all of \program{mh}'s behaviors. In particular, it does not modify and
is not affected by the \file{context} or \file{.mh_profile} files that are used
by \program{mh} to store its state and configuration.

\class{MH} instances have all of the methods of \class{Mailbox} in addition to
the following:

\begin{methoddesc}{list_folders}{}
Return a list of the names of all folders.
\end{methoddesc}

\begin{methoddesc}{get_folder}{folder}
Return an \class{MH} instance representing the folder whose name is
\var{folder}. A \exception{NoSuchMailboxError} exception is raised if the
folder does not exist.
\end{methoddesc}

\begin{methoddesc}{add_folder}{folder}
Create a folder whose name is \var{folder} and return an \class{MH} instance
representing it.
\end{methoddesc}

\begin{methoddesc}{remove_folder}{folder}
Delete the folder whose name is \var{folder}. If the folder contains any
messages, a \exception{NotEmptyError} exception will be raised and the folder
will not be deleted.
\end{methoddesc}

\begin{methoddesc}{get_sequences}{}
Return a dictionary of sequence names mapped to key lists. If there are no
sequences, the empty dictionary is returned.
\end{methoddesc}

\begin{methoddesc}{set_sequences}{sequences}
Re-define the sequences that exist in the mailbox based upon \var{sequences}, a
dictionary of names mapped to key lists, such as that returned by
\method{get_sequences()}.
\end{methoddesc}

\begin{methoddesc}{pack}{}
Rename messages in the mailbox as necessary to eliminate gaps in numbering.
Entries in the sequences list are updated correspondingly. \note{Already-issued
keys are invalidated by this operation and should not be subsequently used.}
\end{methoddesc}
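
Sequences and packing interact as sketched below. The scratch directory, the
messages, and the sequence name \code{flagged} are invented for the example.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
root = os.path.join(tempfile.mkdtemp(), 'Mail')
box = mailbox.MH(root)
keys = [box.add('From: alice@example.com\nSubject: %d\n\nBody.\n' % n)
        for n in range(3)]

# Group the first and last messages in a named sequence.
box.set_sequences({'flagged': [keys[0], keys[2]]})
before = box.get_sequences()

# Removing the middle message leaves a gap in the numbering, which
# pack() closes by renaming; sequence entries are updated to match.
box.remove(keys[1])
box.pack()
after_keys = list(box.keys())
after = box.get_sequences()
box.close()
```

After \method{pack()}, the keys held in \code{keys} are no longer valid and
should be discarded in favor of freshly obtained ones.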

Some \class{Mailbox} methods implemented by \class{MH} deserve special remarks:

\begin{methoddesc}{remove}{key}
\methodline{__delitem__}{key}
\methodline{discard}{key}
These methods immediately delete the message. The MH convention of marking a
message for deletion by prepending a comma to its name is not used.
\end{methoddesc}

\begin{methoddesc}{lock}{}
\methodline{unlock}{}
Three locking mechanisms are used---dot locking and, if available, the
\cfunction{flock()} and \cfunction{lockf()} system calls. For MH mailboxes,
locking the mailbox means locking the \file{.mh_sequences} file and, only for
the duration of any operations that affect them, locking individual message
files.
\end{methoddesc}

\begin{methoddesc}{get_file}{key}
Depending upon the host platform, it may not be possible to remove the
underlying message while the returned file remains open.
\end{methoddesc}

\begin{methoddesc}{flush}{}
All changes to MH mailboxes are immediately applied, so this method does
nothing.
\end{methoddesc}

\begin{methoddesc}{close}{}
\class{MH} instances do not keep any open files, so this method is equivalent
to \method{unlock()}.
\end{methoddesc}

\begin{seealso}
\seelink{http://www.nongnu.org/nmh/}{nmh - Message Handling System}{Home page
of \program{nmh}, an updated version of the original \program{mh}.}
\seelink{http://www.ics.uci.edu/\tilde{}mh/book/}{MH \& nmh: Email for Users \&
Programmers}{A GPL-licensed book on \program{mh} and \program{nmh}, with some
information on the mailbox format.}
\end{seealso}

\subsubsection{\class{Babyl}}
\label{mailbox-babyl}

\begin{classdesc}{Babyl}{path\optional{, factory=None\optional{, create=True}}}
A subclass of \class{Mailbox} for mailboxes in Babyl format. Parameter
\var{factory} is a callable object that accepts a file-like message
representation (which behaves as if opened in binary mode) and returns a custom
representation. If \var{factory} is \code{None}, \class{BabylMessage} is used
as the default message representation. If \var{create} is \code{True}, the
mailbox is created if it does not exist.
\end{classdesc}

Babyl is a single-file mailbox format used by the Rmail mail user agent
included with Emacs. The beginning of a message is indicated by a line
containing the two characters Control-Underscore
(\character{\textbackslash037}) and Control-L (\character{\textbackslash014}).
The end of a message is indicated by the start of the next message or, in the
case of the last message, a line containing a Control-Underscore
(\character{\textbackslash037}) character.

Messages in a Babyl mailbox have two sets of headers, original headers and
so-called visible headers. Visible headers are typically a subset of the
original headers that have been reformatted or abridged to be more attractive.
Each message in a Babyl mailbox also has an accompanying list of \dfn{labels},
or short strings that record extra information about the message, and a list of
all user-defined labels found in the mailbox is kept in the Babyl options
section.

\class{Babyl} instances have all of the methods of \class{Mailbox} in addition
to the following:

\begin{methoddesc}{get_labels}{}
Return a list of the names of all user-defined labels used in the mailbox.
\note{The actual messages are inspected to determine which labels exist in the
mailbox rather than consulting the list of labels in the Babyl options section,
but the Babyl options section is updated whenever the mailbox is modified.}
\end{methoddesc}
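
A small sketch of \method{get_labels()}, assuming that labels are attached
through the \method{add_label()} method of \class{BabylMessage}. The scratch
path, the messages, and the label names are invented for the example.

```python
import mailbox
import os
import tempfile

# Invented scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), 'mail.babyl')
box = mailbox.Babyl(path)

# Attach a user-defined label to each message before adding it.
first = mailbox.BabylMessage('From: alice@example.com\nSubject: a\n\nBody.\n')
first.add_label('project-x')
second = mailbox.BabylMessage('From: bob@example.com\nSubject: b\n\nBody.\n')
second.add_label('urgent')
box.add(first)
box.add(second)

# The result is gathered from the messages themselves; its order is
# not specified, so it is sorted here.
labels = sorted(box.get_labels())
box.close()
```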

Some \class{Mailbox} methods implemented by \class{Babyl} deserve special
remarks:

\begin{methoddesc}{get_file}{key}
In Babyl mailboxes, the headers of a message are not stored contiguously with
the body of the message. To generate a file-like representation, the headers
and body are copied together into a \class{StringIO} instance (from the
\module{StringIO} module), which has an API identical to that of a file. As a
result, the file-like object is truly independent of the underlying mailbox but
does not save memory compared to a string representation.
\end{methoddesc}

\begin{methoddesc}{lock}{}
\methodline{unlock}{}
Three locking mechanisms are used---dot locking and, if available, the
\cfunction{flock()} and \cfunction{lockf()} system calls.
\end{methoddesc}

\begin{seealso}
\seelink{http://quimby.gnus.org/notes/BABYL}{Format of Version 5 Babyl Files}{A
specification of the Babyl format.}
\seelink{http://www.gnu.org/software/emacs/manual/html_node/Rmail.html}{Reading
Mail with Rmail}{The Rmail manual, with some information on Babyl semantics.}
\end{seealso}

\subsubsection{\class{MMDF}}
\label{mailbox-mmdf}

\begin{classdesc}{MMDF}{path\optional{, factory=None\optional{, create=True}}}
A subclass of \class{Mailbox} for mailboxes in MMDF format. Parameter
\var{factory} is a callable object that accepts a file-like message
representation (which behaves as if opened in binary mode) and returns a custom
representation. If \var{factory} is \code{None}, \class{MMDFMessage} is used as
the default message representation. If \var{create} is \code{True}, the mailbox
is created if it does not exist.
\end{classdesc}

MMDF is a single-file mailbox format invented for the Multichannel Memorandum
Distribution Facility, a mail transfer agent. Each message is in the same form
as an mbox message but is bracketed before and after by lines containing four
Control-A (\character{\textbackslash001}) characters. As with the mbox format,
the beginning of each message is indicated by a line whose first five
characters are "From~", but additional occurrences of "From~" are not
transformed to ">From~" when storing messages because the extra message
separator lines prevent mistaking such occurrences for the starts of subsequent
messages.
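
The framing described above can be observed with a minimal sketch along the
following lines. It assumes a scratch directory obtained from
\code{tempfile.mkdtemp()} and a made-up file name and message, none of which
come from the module itself, and it inspects only the separator lines:

\begin{verbatim}
import mailbox
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'example.mmdf')

box = mailbox.MMDF(path)
box.add('From: alice@example.com\nSubject: ping\n\nHello.\n')
box.close()                     # write the mailbox to disk

f = open(path, 'rb')
raw = f.read()
f.close()

# One separator line before the message and one after it.
separator = b'\x01' * 4
\end{verbatim}

With a single message stored, the file should begin with the separator and
contain it exactly twice.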

Some \class{Mailbox} methods implemented by \class{MMDF} deserve special
remarks:

\begin{methoddesc}{get_file}{key}
Using the file after calling \method{flush()} or \method{close()} on the
\class{MMDF} instance may yield unpredictable results or raise an exception.
\end{methoddesc}

\begin{methoddesc}{lock}{}
\methodline{unlock}{}
Three locking mechanisms are used---dot locking and, if available, the
\cfunction{flock()} and \cfunction{lockf()} system calls.
\end{methoddesc}

\begin{seealso}
\seelink{http://www.tin.org/bin/man.cgi?section=5\&topic=mmdf}{mmdf man page
from tin}{A specification of MMDF format from the documentation of tin, a
newsreader.}
\seelink{http://en.wikipedia.org/wiki/MMDF}{MMDF}{A Wikipedia article
describing the Multichannel Memorandum Distribution Facility.}
\end{seealso}

\subsection{\class{Message} objects}
\label{mailbox-message-objects}

\begin{classdesc}{Message}{\optional{message}}
A subclass of the \module{email.Message} module's \class{Message}. Subclasses
of \class{mailbox.Message} add mailbox-format-specific state and behavior.

If \var{message} is omitted, the new instance is created in a default, empty
state. If \var{message} is an \class{email.Message.Message} instance, its
contents are copied; furthermore, any format-specific information is converted
insofar as possible if \var{message} is a \class{Message} instance. If
\var{message} is a string or a file, it should contain an \rfc{2822}-compliant
message, which is read and parsed.
\end{classdesc}
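
As a rough illustration of the constructor's accepted arguments, the sketch
below builds a \class{Message} in three of the ways described. The message text
and the variable names are invented for the example:

\begin{verbatim}
import email
import mailbox

text = 'From: alice@example.com\nSubject: ping\n\nHello.\n'

empty = mailbox.Message()        # default, empty state
parsed = mailbox.Message(text)   # string is read and parsed
# Contents of an existing email message object are copied.
copied = mailbox.Message(email.message_from_string(text))
\end{verbatim}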

The format-specific state and behaviors offered by subclasses vary, but in
general only properties that are not tied to one particular mailbox are
supported (although such properties are presumably specific to a particular
mailbox format). For example, file offsets for single-file mailbox
formats and file names for directory-based mailbox formats are not retained,
because they are only applicable to the original mailbox. But state such as
whether a message has been read by the user or marked as important is retained,
because it applies to the message itself.

There is no requirement that \class{Message} instances be used to represent
messages retrieved using \class{Mailbox} instances. In some situations, the
time and memory required to generate \class{Message} representations might not
be acceptable. For such situations, \class{Mailbox} instances also offer
string and file-like representations, and a custom message factory may be
specified when a \class{Mailbox} instance is initialized. 
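
The sketch below contrasts the string representation with the default
\class{Message} representation. It assumes a throwaway mbox file in a scratch
directory; the path and message text are placeholders chosen for the example:

\begin{verbatim}
import mailbox
import os
import tempfile

# Hypothetical scratch mailbox.
path = os.path.join(tempfile.mkdtemp(), 'example.mbox')

box = mailbox.mbox(path)
key = box.add('From: alice@example.com\nSubject: ping\n\nHello.\n')
box.flush()

raw = box.get_string(key)   # the message as a single string
message = box[key]          # the default representation, an mboxMessage
box.close()
\end{verbatim}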

\subsubsection{\class{MaildirMessage}}
\label{mailbox-maildirmessage}

\begin{classdesc}{MaildirMessage}{\optional{message}}
A message with Maildir-specific behaviors. Parameter \var{message}
has the same meaning as with the \class{Message} constructor.
\end{classdesc}

Typically, a mail user agent application moves all of the messages in the
\file{new} subdirectory to the \file{cur} subdirectory after the first time the
user opens and closes the mailbox, recording that the messages are old whether
or not they've actually been read. Each message in \file{cur} has an "info"
section added to its file name to store information about its state. (Some mail
readers may also add an "info" section to messages in \file{new}.) The "info"
section may take one of two forms: it may contain "2," followed by a list of
standardized flags (e.g., "2,FR") or it may contain "1," followed by so-called
experimental information. Standard flags for Maildir messages are as follows:

\begin{tableiii}{l|l|l}{textrm}{Flag}{Meaning}{Explanation}
\lineiii{D}{Draft}{Under composition}
\lineiii{F}{Flagged}{Marked as important}
\lineiii{P}{Passed}{Forwarded, resent, or bounced}
\lineiii{R}{Replied}{Replied to}
\lineiii{S}{Seen}{Read}
\lineiii{T}{Trashed}{Marked for subsequent deletion}
\end{tableiii}

\class{MaildirMessage} instances offer the following methods:

\begin{methoddesc}{get_subdir}{}
Return either "new" (if the message should be stored in the \file{new}
subdirectory) or "cur" (if the message should be stored in the \file{cur}
subdirectory). \note{A message is typically moved from \file{new} to \file{cur}
after its mailbox has been accessed, whether or not the message has been
read. A message \code{msg} has been read if \code{"S" in msg.get_flags()}
is \code{True}.}
\end{methoddesc}

\begin{methoddesc}{set_subdir}{subdir}
Set the subdirectory the message should be stored in. Parameter \var{subdir}
must be either "new" or "cur".
\end{methoddesc}

\begin{methoddesc}{get_flags}{}
Return a string specifying the flags that are currently set. If the message
complies with the standard Maildir format, the result is the concatenation in
alphabetical order of zero or one occurrence of each of \character{D},
\character{F}, \character{P}, \character{R}, \character{S}, and \character{T}.
The empty string is returned if no flags are set or if "info" contains
experimental semantics.
\end{methoddesc}

\begin{methoddesc}{set_flags}{flags}
Set the flags specified by \var{flags} and unset all others.
\end{methoddesc}

\begin{methoddesc}{add_flag}{flag}
Set the flag(s) specified by \var{flag} without changing other flags. To add
more than one flag at a time, \var{flag} may be a string of more than one
character. The current "info" is overwritten whether or not it contains
experimental information rather than flags.
\end{methoddesc}

\begin{methoddesc}{remove_flag}{flag}
Unset the flag(s) specified by \var{flag} without changing other flags. To
remove more than one flag at a time, \var{flag} may be a string of more than one
character. If "info" contains experimental information rather than flags, the
current "info" is not modified.
\end{methoddesc}

\begin{methoddesc}{get_date}{}
Return the delivery date of the message as a floating-point number representing
seconds since the epoch.
\end{methoddesc}

\begin{methoddesc}{set_date}{date}
Set the delivery date of the message to \var{date}, a floating-point number
representing seconds since the epoch.
\end{methoddesc}

\begin{methoddesc}{get_info}{}
Return a string containing the "info" for a message. This is useful for
accessing and modifying "info" that is experimental (i.e., not a list of
flags).
\end{methoddesc}

\begin{methoddesc}{set_info}{info}
Set "info" to \var{info}, which should be a string.
\end{methoddesc}
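
The interplay between flags and "info" can be sketched as follows. The message
text and the experimental "info" string are made up for illustration; only the
methods documented above are used:

\begin{verbatim}
import mailbox

msg = mailbox.MaildirMessage('Subject: ping\n\nHello.\n')
initial_subdir = msg.get_subdir()      # a fresh message starts in "new"

msg.set_subdir('cur')
msg.add_flag('FR')                     # several flags at once
msg.add_flag('S')
msg.remove_flag('R')
flags = msg.get_flags()                # flags in alphabetical order
info = msg.get_info()                  # the raw "info" behind the flags

msg.set_info('1,hypothetical-data')    # invented experimental "info"
flags_when_experimental = msg.get_flags()
msg.remove_flag('S')                   # experimental "info" is left alone
info_after_remove = msg.get_info()
msg.add_flag('S')                      # experimental "info" is overwritten
info_after_add = msg.get_info()
\end{verbatim}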

When a \class{MaildirMessage} instance is created based upon an
\class{mboxMessage} or \class{MMDFMessage} instance, the \mailheader{Status}
and \mailheader{X-Status} headers are omitted and the following conversions
take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{mboxMessage} or \class{MMDFMessage} state}
\lineii{"cur" subdirectory}{O flag}
\lineii{F flag}{F flag}
\lineii{R flag}{A flag}
\lineii{S flag}{R flag}
\lineii{T flag}{D flag}
\end{tableii}
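
A short sketch of this conversion, using an invented message with the O, R,
and A flags set:

\begin{verbatim}
import mailbox

source = mailbox.mboxMessage('Subject: ping\n\nHello.\n')
source.set_flags('ROA')          # read, old, answered

converted = mailbox.MaildirMessage(source)
subdir = converted.get_subdir()  # O flag maps to the "cur" subdirectory
flags = converted.get_flags()    # A maps to R, and R maps to S
\end{verbatim}

The \mailheader{Status} and \mailheader{X-Status} headers should be absent
from \code{converted}, while \code{source} itself is left unchanged.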

When a \class{MaildirMessage} instance is created based upon an
\class{MHMessage} instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MHMessage} state}
\lineii{"cur" subdirectory}{"unseen" sequence}
\lineii{"cur" subdirectory and S flag}{no "unseen" sequence}
\lineii{F flag}{"flagged" sequence}
\lineii{R flag}{"replied" sequence}
\end{tableii}

When a \class{MaildirMessage} instance is created based upon a
\class{BabylMessage} instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{BabylMessage} state}
\lineii{"cur" subdirectory}{"unseen" label}
\lineii{"cur" subdirectory and S flag}{no "unseen" label}
\lineii{P flag}{"forwarded" or "resent" label}
\lineii{R flag}{"answered" label}
\lineii{T flag}{"deleted" label}
\end{tableii}

\subsubsection{\class{mboxMessage}}
\label{mailbox-mboxmessage}

\begin{classdesc}{mboxMessage}{\optional{message}}
A message with mbox-specific behaviors. Parameter \var{message} has the same
meaning as with the \class{Message} constructor.
\end{classdesc}

Messages in an mbox mailbox are stored together in a single file. The sender's
envelope address and the time of delivery are typically stored in a line
beginning with "From~" that is used to indicate the start of a message, though
there is considerable variation in the exact format of this data among mbox
implementations. Flags that indicate the state of the message, such as whether
it has been read or marked as important, are typically stored in
\mailheader{Status} and \mailheader{X-Status} headers.

Conventional flags for mbox messages are as follows:

\begin{tableiii}{l|l|l}{textrm}{Flag}{Meaning}{Explanation}
\lineiii{R}{Read}{Read}
\lineiii{O}{Old}{Previously detected by MUA}
\lineiii{D}{Deleted}{Marked for subsequent deletion}
\lineiii{F}{Flagged}{Marked as important}
\lineiii{A}{Answered}{Replied to}
\end{tableiii}

The "R" and "O" flags are stored in the \mailheader{Status} header, and the
"D", "F", and "A" flags are stored in the \mailheader{X-Status} header. The
flags and headers typically appear in the order mentioned.

\class{mboxMessage} instances offer the following methods:

\begin{methoddesc}{get_from}{}
Return a string representing the "From~" line that marks the start of the
message in an mbox mailbox. The leading "From~" and the trailing newline are
excluded.
\end{methoddesc}

\begin{methoddesc}{set_from}{from_\optional{, time_=None}}
Set the "From~" line to \var{from_}, which should be specified without a
leading "From~" or trailing newline. For convenience, \var{time_} may be
specified and will be formatted appropriately and appended to \var{from_}. If
\var{time_} is specified, it should be a \class{struct_time} instance, a tuple
suitable for passing to \method{time.strftime()}, or \code{True} (to use
\method{time.gmtime()}).
\end{methoddesc}

\begin{methoddesc}{get_flags}{}
Return a string specifying the flags that are currently set. If the message
complies with the conventional format, the result is the concatenation in the
following order of zero or one occurrence of each of \character{R},
\character{O}, \character{D}, \character{F}, and \character{A}.
\end{methoddesc}

\begin{methoddesc}{set_flags}{flags}
Set the flags specified by \var{flags} and unset all others. Parameter
\var{flags} should be the concatenation in any order of zero or more
occurrences of each of \character{R}, \character{O}, \character{D},
\character{F}, and \character{A}.
\end{methoddesc}

\begin{methoddesc}{add_flag}{flag}
Set the flag(s) specified by \var{flag} without changing other flags. To add
more than one flag at a time, \var{flag} may be a string of more than one
character.
\end{methoddesc}

\begin{methoddesc}{remove_flag}{flag}
Unset the flag(s) specified by \var{flag} without changing other flags. To
remove more than one flag at a time, \var{flag} may be a string of more than one
character.
\end{methoddesc}
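
The methods above might be exercised as in the following sketch. The sender
address is invented, and the epoch is used as the timestamp only so that the
example is reproducible:

\begin{verbatim}
import time
import mailbox

msg = mailbox.mboxMessage('Subject: ping\n\nHello.\n')

# The time tuple is formatted and appended to the sender address.
msg.set_from('alice@example.com', time.gmtime(0))
from_line = msg.get_from()

msg.set_flags('AR')       # any order is accepted
msg.add_flag('F')
msg.remove_flag('R')
flags = msg.get_flags()   # reported in the conventional R, O, D, F, A order
\end{verbatim}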

When an \class{mboxMessage} instance is created based upon a
\class{MaildirMessage} instance, a "From~" line is generated based upon the
\class{MaildirMessage} instance's delivery date, and the following conversions
take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MaildirMessage} state}
\lineii{R flag}{S flag}
\lineii{O flag}{"cur" subdirectory}
\lineii{D flag}{T flag}
\lineii{F flag}{F flag}
\lineii{A flag}{R flag}
\end{tableii}
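
A sketch of the conversion in this direction, with an invented message whose
delivery date is set to the epoch so that the generated "From~" line is
predictable:

\begin{verbatim}
import mailbox

source = mailbox.MaildirMessage('Subject: ping\n\nHello.\n')
source.set_subdir('cur')
source.set_flags('FS')
source.set_date(0.0)             # delivery date: the epoch

converted = mailbox.mboxMessage(source)
flags = converted.get_flags()    # S maps to R, "cur" to O, F stays F
from_line = converted.get_from() # generated from the delivery date
\end{verbatim}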

When an \class{mboxMessage} instance is created based upon an \class{MHMessage}
instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MHMessage} state}
\lineii{R flag and O flag}{no "unseen" sequence}
\lineii{O flag}{"unseen" sequence}
\lineii{F flag}{"flagged" sequence}
\lineii{A flag}{"replied" sequence}
\end{tableii}

When an \class{mboxMessage} instance is created based upon a
\class{BabylMessage} instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{BabylMessage} state}
\lineii{R flag and O flag}{no "unseen" label}
\lineii{O flag}{"unseen" label}
\lineii{D flag}{"deleted" label}
\lineii{A flag}{"answered" label}
\end{tableii}

When an \class{mboxMessage} instance is created based upon an \class{MMDFMessage}
instance, the "From~" line is copied and all flags directly correspond:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MMDFMessage} state}
\lineii{R flag}{R flag}
\lineii{O flag}{O flag}
\lineii{D flag}{D flag}
\lineii{F flag}{F flag}
\lineii{A flag}{A flag}
\end{tableii}

\subsubsection{\class{MHMessage}}
\label{mailbox-mhmessage}

\begin{classdesc}{MHMessage}{\optional{message}}
A message with MH-specific behaviors. Parameter \var{message} has the same
meaning as with the \class{Message} constructor.
\end{classdesc}

MH messages do not support marks or flags in the traditional sense, but they do
support sequences, which are logical groupings of arbitrary messages. Some mail
reading programs (although not the standard \program{mh} and \program{nmh}) use
sequences in much the same way flags are used with other formats, as follows:

\begin{tableii}{l|l}{textrm}{Sequence}{Explanation}
\lineii{unseen}{Not read, but previously detected by MUA}
\lineii{replied}{Replied to}
\lineii{flagged}{Marked as important}
\end{tableii}

\class{MHMessage} instances offer the following methods:

\begin{methoddesc}{get_sequences}{}
Return a list of the names of sequences that include this message.
\end{methoddesc}

\begin{methoddesc}{set_sequences}{sequences}
Set the list of sequences that include this message.
\end{methoddesc}

\begin{methoddesc}{add_sequence}{sequence}
Add \var{sequence} to the list of sequences that include this message.
\end{methoddesc}

\begin{methoddesc}{remove_sequence}{sequence}
Remove \var{sequence} from the list of sequences that include this message.
\end{methoddesc}
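
A brief sketch of sequence handling on a single message follows. The message
text and the sequence name \code{'project-x'} are invented for the example:

\begin{verbatim}
import mailbox

msg = mailbox.MHMessage('Subject: ping\n\nHello.\n')
msg.add_sequence('unseen')
msg.add_sequence('flagged')
msg.remove_sequence('unseen')
sequences = msg.get_sequences()

# Replace the whole list; sequence names are arbitrary strings.
msg.set_sequences(['replied', 'project-x'])
\end{verbatim}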

When an \class{MHMessage} instance is created based upon a
\class{MaildirMessage} instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MaildirMessage} state}
\lineii{"unseen" sequence}{no S flag}
\lineii{"replied" sequence}{R flag}
\lineii{"flagged" sequence}{F flag}
\end{tableii}

When an \class{MHMessage} instance is created based upon an \class{mboxMessage}
or \class{MMDFMessage} instance, the \mailheader{Status} and
\mailheader{X-Status} headers are omitted and the following conversions take
place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{mboxMessage} or \class{MMDFMessage} state}
\lineii{"unseen" sequence}{no R flag}
\lineii{"replied" sequence}{A flag}
\lineii{"flagged" sequence}{F flag}
\end{tableii}
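
For instance, in the following sketch an invented mbox message that is
answered and flagged but not read ends up in all three sequences:

\begin{verbatim}
import mailbox

source = mailbox.mboxMessage('Subject: ping\n\nHello.\n')
source.set_flags('AF')                 # answered and flagged, not read

converted = mailbox.MHMessage(source)
sequences = converted.get_sequences()
\end{verbatim}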

When an \class{MHMessage} instance is created based upon a \class{BabylMessage}
instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{BabylMessage} state}
\lineii{"unseen" sequence}{"unseen" label}
\lineii{"replied" sequence}{"answered" label}
\end{tableii}

\subsubsection{\class{BabylMessage}}
\label{mailbox-babylmessage}

\begin{classdesc}{BabylMessage}{\optional{message}}
A message with Babyl-specific behaviors. Parameter \var{message} has the same
meaning as with the \class{Message} constructor.
\end{classdesc}

Certain message labels, called \dfn{attributes}, are defined by convention to
have special meanings. The attributes are as follows:

\begin{tableii}{l|l}{textrm}{Label}{Explanation}
\lineii{unseen}{Not read, but previously detected by MUA}
\lineii{deleted}{Marked for subsequent deletion}
\lineii{filed}{Copied to another file or mailbox}
\lineii{answered}{Replied to}
\lineii{forwarded}{Forwarded}
\lineii{edited}{Modified by the user}
\lineii{resent}{Resent}
\end{tableii}

By default, Rmail displays only
visible headers. The \class{BabylMessage} class, though, uses the original
headers because they are more complete. Visible headers may be accessed
explicitly if desired.

\class{BabylMessage} instances offer the following methods:

\begin{methoddesc}{get_labels}{}
Return a list of labels on the message.
\end{methoddesc}

\begin{methoddesc}{set_labels}{labels}
Set the list of labels on the message to \var{labels}.
\end{methoddesc}

\begin{methoddesc}{add_label}{label}
Add \var{label} to the list of labels on the message.
\end{methoddesc}

\begin{methoddesc}{remove_label}{label}
Remove \var{label} from the list of labels on the message.
\end{methoddesc}

\begin{methoddesc}{get_visible}{}
Return a \class{Message} instance whose headers are the message's visible
headers and whose body is empty.
\end{methoddesc}

\begin{methoddesc}{set_visible}{visible}
Set the message's visible headers to be the same as the headers in
\var{visible}. Parameter \var{visible} should be a \class{Message} instance, an
\class{email.Message.Message} instance, a string, or a file-like object (which
should be open in text mode).
\end{methoddesc}

\begin{methoddesc}{update_visible}{}
When a \class{BabylMessage} instance's original headers are modified, the
visible headers are not automatically modified to correspond. This method
updates the visible headers as follows: each visible header with a
corresponding original header is set to the value of the original header, each
visible header without a corresponding original header is removed, and any of
\mailheader{Date}, \mailheader{From}, \mailheader{Reply-To}, \mailheader{To},
\mailheader{CC}, and \mailheader{Subject} that are present in the original
headers but not the visible headers are added to the visible headers.
\end{methoddesc}
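
The relationship between original and visible headers can be sketched as
follows. The message and its \mailheader{X-Extra} header are invented for the
example:

\begin{verbatim}
import mailbox

msg = mailbox.BabylMessage(
    'From: alice@example.com\nSubject: old\nX-Extra: 1\n\nHello.\n')
msg.add_label('answered')

msg.update_visible()                    # seed visible headers
first = msg.get_visible()

msg.replace_header('Subject', 'new')    # changes original headers only
stale = msg.get_visible()['Subject']

msg.update_visible()                    # bring visible headers up to date
fresh = msg.get_visible()['Subject']
\end{verbatim}

Only \mailheader{From} and \mailheader{Subject} should become visible here,
because \mailheader{X-Extra} is not among the headers that
\method{update_visible()} adds.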

When a \class{BabylMessage} instance is created based upon a
\class{MaildirMessage} instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MaildirMessage} state}
\lineii{"unseen" label}{no S flag}
\lineii{"deleted" label}{T flag}
\lineii{"answered" label}{R flag}
\lineii{"forwarded" label}{P flag}
\end{tableii}
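
As an illustration, the sketch below converts an invented Maildir message
that has been forwarded and trashed but never read:

\begin{verbatim}
import mailbox

source = mailbox.MaildirMessage('Subject: ping\n\nHello.\n')
source.set_flags('PT')                 # passed and trashed, not seen

converted = mailbox.BabylMessage(source)
labels = converted.get_labels()
\end{verbatim}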

When a \class{BabylMessage} instance is created based upon an
\class{mboxMessage} or \class{MMDFMessage} instance, the \mailheader{Status}
and \mailheader{X-Status} headers are omitted and the following conversions
take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{mboxMessage} or \class{MMDFMessage} state}
\lineii{"unseen" label}{no R flag}
\lineii{"deleted" label}{D flag}
\lineii{"answered" label}{A flag}
\end{tableii}

When a \class{BabylMessage} instance is created based upon an \class{MHMessage}
instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MHMessage} state}
\lineii{"unseen" label}{"unseen" sequence}
\lineii{"answered" label}{"replied" sequence}
\end{tableii}

\subsubsection{\class{MMDFMessage}}
\label{mailbox-mmdfmessage}

\begin{classdesc}{MMDFMessage}{\optional{message}}
A message with MMDF-specific behaviors. Parameter \var{message} has the same
meaning as with the \class{Message} constructor.
\end{classdesc}

As with messages in an mbox mailbox, MMDF messages are stored with the sender's
address and the delivery date in an initial line beginning with "From~".
Likewise, flags that indicate the state of the message are typically stored in
\mailheader{Status} and \mailheader{X-Status} headers.

Conventional flags for MMDF messages are identical to those of mbox messages and
are as follows:

\begin{tableiii}{l|l|l}{textrm}{Flag}{Meaning}{Explanation}
\lineiii{R}{Read}{Read}
\lineiii{O}{Old}{Previously detected by MUA}
\lineiii{D}{Deleted}{Marked for subsequent deletion}
\lineiii{F}{Flagged}{Marked as important}
\lineiii{A}{Answered}{Replied to}
\end{tableiii}

The "R" and "O" flags are stored in the \mailheader{Status} header, and the
"D", "F", and "A" flags are stored in the \mailheader{X-Status} header. The
flags and headers typically appear in the order mentioned.

\class{MMDFMessage} instances offer the following methods, which are identical
to those offered by \class{mboxMessage}:

\begin{methoddesc}{get_from}{}
Return a string representing the "From~" line that marks the start of the
message in an mbox mailbox. The leading "From~" and the trailing newline are
excluded.
\end{methoddesc}

\begin{methoddesc}{set_from}{from_\optional{, time_=None}}
Set the "From~" line to \var{from_}, which should be specified without a
leading "From~" or trailing newline. For convenience, \var{time_} may be
specified and will be formatted appropriately and appended to \var{from_}. If
\var{time_} is specified, it should be a \class{struct_time} instance, a tuple
suitable for passing to \method{time.strftime()}, or \code{True} (to use
\method{time.gmtime()}).
\end{methoddesc}

\begin{methoddesc}{get_flags}{}
Return a string specifying the flags that are currently set. If the message
complies with the conventional format, the result is the concatenation in the
following order of zero or one occurrence of each of \character{R},
\character{O}, \character{D}, \character{F}, and \character{A}.
\end{methoddesc}

\begin{methoddesc}{set_flags}{flags}
Set the flags specified by \var{flags} and unset all others. Parameter
\var{flags} should be the concatenation in any order of zero or more
occurrences of each of \character{R}, \character{O}, \character{D},
\character{F}, and \character{A}.
\end{methoddesc}

\begin{methoddesc}{add_flag}{flag}
Set the flag(s) specified by \var{flag} without changing other flags. To add
more than one flag at a time, \var{flag} may be a string of more than one
character.
\end{methoddesc}

\begin{methoddesc}{remove_flag}{flag}
Unset the flag(s) specified by \var{flag} without changing other flags. To
remove more than one flag at a time, \var{flag} may be a string of more than one
character.
\end{methoddesc}

When an \class{MMDFMessage} instance is created based upon a
\class{MaildirMessage} instance, a "From~" line is generated based upon the
\class{MaildirMessage} instance's delivery date, and the following conversions
take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MaildirMessage} state}
\lineii{R flag}{S flag}
\lineii{O flag}{"cur" subdirectory}
\lineii{D flag}{T flag}
\lineii{F flag}{F flag}
\lineii{A flag}{R flag}
\end{tableii}

When an \class{MMDFMessage} instance is created based upon an \class{MHMessage}
instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{MHMessage} state}
\lineii{R flag and O flag}{no "unseen" sequence}
\lineii{O flag}{"unseen" sequence}
\lineii{F flag}{"flagged" sequence}
\lineii{A flag}{"replied" sequence}
\end{tableii}

When an \class{MMDFMessage} instance is created based upon a
\class{BabylMessage} instance, the following conversions take place:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{BabylMessage} state}
\lineii{R flag and O flag}{no "unseen" label}
\lineii{O flag}{"unseen" label}
\lineii{D flag}{"deleted" label}
\lineii{A flag}{"answered" label}
\end{tableii}

When an \class{MMDFMessage} instance is created based upon an
\class{mboxMessage} instance, the "From~" line is copied and all flags directly
correspond:

\begin{tableii}{l|l}{textrm}
    {Resulting state}{\class{mboxMessage} state}
\lineii{R flag}{R flag}
\lineii{O flag}{O flag}
\lineii{D flag}{D flag}
\lineii{F flag}{F flag}
\lineii{A flag}{A flag}
\end{tableii}
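
A minimal sketch of this direct correspondence, with an invented sender
address and timestamp in the "From~" line:

\begin{verbatim}
import mailbox

source = mailbox.mboxMessage('Subject: ping\n\nHello.\n')
source.set_from('alice@example.com Thu Jan  1 00:00:00 1970')
source.set_flags('ROF')

converted = mailbox.MMDFMessage(source)
same_from = converted.get_from() == source.get_from()
flags = converted.get_flags()
\end{verbatim}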

\subsection{Exceptions}
\label{mailbox-exceptions}

The following exception classes are defined in the \module{mailbox} module:

\begin{classdesc}{Error}{}
The base class for all other module-specific exceptions.
\end{classdesc}

\begin{classdesc}{NoSuchMailboxError}{}
Raised when a mailbox is expected but is not found, such as when instantiating
a \class{Mailbox} subclass with a path that does not exist (and with the
\var{create} parameter set to \code{False}), or when opening a folder that does
not exist.
\end{classdesc}

\begin{classdesc}{NotEmptyError}{}
Raised when a mailbox is not empty but is expected to be, such as when deleting
a folder that contains messages.
\end{classdesc}

\begin{classdesc}{ExternalClashError}{}
Raised when some mailbox-related condition beyond the control of the program
causes it to be unable to proceed, such as when failing to acquire a lock that
another program already holds, or when a uniquely-generated file name
already exists.
\end{classdesc}

\begin{classdesc}{FormatError}{}
Raised when the data in a file cannot be parsed, such as when an \class{MH}
instance attempts to read a corrupted \file{.mh_sequences} file.
\end{classdesc}
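
Two of these exceptions can be provoked with a sketch such as the following.
It assumes a scratch directory, and the file and folder names are placeholders
chosen for the example:

\begin{verbatim}
import mailbox
import os
import tempfile

scratch = tempfile.mkdtemp()           # hypothetical scratch directory

# Opening a mailbox that does not exist, without creating it.
try:
    mailbox.mbox(os.path.join(scratch, 'missing.mbox'), create=False)
except mailbox.NoSuchMailboxError:
    missing_reported = True
else:
    missing_reported = False

# Removing a Maildir folder that still contains a message.
box = mailbox.Maildir(os.path.join(scratch, 'Maildir'))
folder = box.add_folder('lists')
folder.add('Subject: ping\n\nHello.\n')
try:
    box.remove_folder('lists')
except mailbox.NotEmptyError:
    not_empty_reported = True
else:
    not_empty_reported = False
\end{verbatim}

Because both classes derive from \class{Error}, a single
\code{except mailbox.Error} clause would catch either condition.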

\subsection{Deprecated classes and methods}
\label{mailbox-deprecated}

Older versions of the \module{mailbox} module do not support modification of
mailboxes, such as adding or removing messages, and do not provide classes to
represent format-specific message properties. For backward compatibility, the
older mailbox classes are still available, but the newer classes should be used
in preference to them.

Older mailbox objects support only iteration and provide a single public
method:

\begin{methoddesc}{next}{}
Return the next message in the mailbox, created with the optional \var{factory}
argument passed into the mailbox object's constructor. By default this is an
\class{rfc822.Message} object (see the \refmodule{rfc822} module).  Depending
on the mailbox implementation the \var{fp} attribute of this object may be a
true file object or a class instance simulating a file object, taking care of
things like message boundaries if multiple mail messages are contained in a
single file, etc.  If no more messages are available, this method returns
\code{None}.
\end{methoddesc}

Most of the older mailbox classes have names that differ from the current
mailbox class names, except for \class{Maildir}. For this reason, the new
\class{Maildir} class defines a \method{next()} method and its constructor
differs slightly from those of the other new mailbox classes.

The older mailbox classes whose names are not the same as their newer
counterparts are as follows:

\begin{classdesc}{UnixMailbox}{fp\optional{, factory}}
Access to a classic \UNIX-style mailbox, where all messages are
contained in a single file and separated by \samp{From }
(a.k.a.\ \samp{From_}) lines.  The file object \var{fp} points to the
mailbox file.  The optional \var{factory} parameter is a callable that
should create new message objects.  \var{factory} is called with one
argument, \var{fp}, by the \method{next()} method of the mailbox
object.  The default is the \class{rfc822.Message} class (see the
\refmodule{rfc822} module -- and the note below).

\begin{notice}
  For reasons of this module's internal implementation, you will
  probably want to open the \var{fp} object in binary mode.  This is
  especially important on Windows.
\end{notice}

For maximum portability, messages in a \UNIX-style mailbox are
separated by any line that begins exactly with the string \code{'From
'} (note the trailing space) if preceded by exactly two newlines.
Because of the wide range of variations in practice, nothing else on
the From_ line should be considered.  However, the current
implementation doesn't check for the leading two newlines.  This is
usually fine for most applications.

The \class{UnixMailbox} class implements a stricter version of
From_ line checking, using a regular expression that usually correctly
matches From_ delimiters.  It considers messages to be separated
by \samp{From \var{name} \var{time}} lines.  For maximum portability,
use the \class{PortableUnixMailbox} class instead.  This class is
identical to \class{UnixMailbox} except that individual messages are
separated by only \samp{From } lines.

For more information, see
\citetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
Netscape Mail on \UNIX: Why the Content-Length Format is Bad}.
\end{classdesc}

\begin{classdesc}{PortableUnixMailbox}{fp\optional{, factory}}
A less-strict version of \class{UnixMailbox}, which considers only the
\samp{From } at the beginning of the line separating messages.  The
``\var{name} \var{time}'' portion of the From line is ignored, to
protect against some variations that are observed in practice.  This
works since lines in the message which begin with \code{'From '} are
quoted by mail handling software at delivery-time.
\end{classdesc}

\begin{classdesc}{MmdfMailbox}{fp\optional{, factory}}
Access an MMDF-style mailbox, where all messages are contained
in a single file and separated by lines consisting of 4 control-A
characters.  The file object \var{fp} points to the mailbox file.
Optional \var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{MHMailbox}{dirname\optional{, factory}}
Access an MH mailbox, a directory with each message in a separate
file with a numeric name.
The name of the mailbox directory is passed in \var{dirname}.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{BabylMailbox}{fp\optional{, factory}}
Access a Babyl mailbox, which is similar to an MMDF mailbox.  In
Babyl format, each message has two sets of headers, the
\emph{original} headers and the \emph{visible} headers.  The original
headers appear before a line containing only \code{'*** EOOH ***'}
(End-Of-Original-Headers) and the visible headers appear after the
\code{EOOH} line.  Babyl-compliant mail readers will show you only the
visible headers, and \class{BabylMailbox} objects will return messages
containing only the visible headers.  You'll have to do your own
parsing of the mailbox file to get at the original headers.  Mail
messages start with the EOOH line and end with a line containing only
\code{'\e{}037\e{}014'}.  \var{factory} is as with the
\class{UnixMailbox} class.
\end{classdesc}

If you wish to use the older mailbox classes with the \module{email} module
rather than the deprecated \module{rfc822} module, you can do so as follows:

\begin{verbatim}
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

mbox = mailbox.UnixMailbox(fp, msgfactory)
\end{verbatim}

Alternatively, if you know your mailbox contains only well-formed MIME
messages, you can simplify this to:

\begin{verbatim}
import email
import mailbox

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
\end{verbatim}

\subsection{Examples}
\label{mailbox-examples}

A simple example of printing the subjects of all messages in a mailbox that
seem interesting:

\begin{verbatim}
import mailbox
for message in mailbox.mbox('~/mbox'):
    subject = message['subject']       # Could possibly be None.
    if subject and 'python' in subject.lower():
        print subject
\end{verbatim}

To copy all mail from a Babyl mailbox to an MH mailbox, converting all
of the format-specific information that can be converted:

\begin{verbatim}
import mailbox
destination = mailbox.MH('~/Mail')
for message in mailbox.Babyl('~/RMAIL'):
    destination.add(mailbox.MHMessage(message))
\end{verbatim}

An example of sorting mail from numerous mailing lists, being careful to avoid
mail corruption due to concurrent modification by other programs, mail loss due
to interruption of the program, or premature termination due to malformed
messages in the mailbox:

\begin{verbatim}
import mailbox
import email.Errors
list_names = ('python-list', 'python-dev', 'python-bugs')
boxes = dict((name, mailbox.mbox('~/email/%s' % name)) for name in list_names)
inbox = mailbox.Maildir('~/Maildir', None)
for key in inbox.iterkeys():
    try:
        message = inbox[key]
    except email.Errors.MessageParseError:
        continue                # The message is malformed. Just leave it.
    for name in list_names:
        list_id = message['list-id']
        if list_id and name in list_id:
            box = boxes[name]
            box.lock()
            box.add(message)
            box.flush()         # Write copy to disk before removing original.
            box.unlock()
            inbox.discard(key)
            break               # Found destination, so stop looking.
for box in boxes.itervalues():
    box.close()
\end{verbatim}