:mod:`csv` --- CSV File Reading and Writing

Source code: :source:`Lib/csv.py`

The so-called CSV (Comma Separated Values) format is the most common import and
export format for spreadsheets and databases.  CSV format was used for many
years prior to attempts to describe the format in a standardized way in
RFC 4180.  The lack of a well-defined standard means that subtle differences
often exist in the data produced and consumed by different applications.  These
differences can make it annoying to process CSV files from multiple sources.
Still, while the delimiters and quoting characters vary, the overall format is
similar enough that it is possible to write a single module which can
efficiently manipulate such data, hiding the details of reading and writing the
data from the programmer.

The :mod:`csv` module implements classes to read and write tabular data in CSV
format.  It allows programmers to say, "write this data in the format preferred
by Excel," or "read data from this file which was generated by Excel," without
knowing the precise details of the CSV format used by Excel.  Programmers can
also describe the CSV formats understood by other applications or define their
own special-purpose CSV formats.

The :mod:`csv` module's :class:`reader` and :class:`writer` objects read and
write sequences.  Programmers can also read and write data in dictionary form
using the :class:`DictReader` and :class:`DictWriter` classes.
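
As a rough illustration of the two styles described above, the following sketch writes rows as sequences and reads them back first as lists and then as dictionaries.  The in-memory :class:`io.StringIO` buffer and the sample values are assumptions made for the example; a real program would normally use a file opened with newline=''.

```python
import csv
import io

# In-memory buffer standing in for a file opened with newline=''.
buffer = io.StringIO(newline='')

# Sequence form: each row is a list of values.
writer = csv.writer(buffer)
writer.writerow(['name', 'dept'])
writer.writerow(['Ada', 'R&D, West'])  # the embedded comma forces quoting

# Read the same data back as lists.
buffer.seek(0)
rows = list(csv.reader(buffer))

# Read it again as dictionaries keyed by the header row.
buffer.seek(0)
records = list(csv.DictReader(buffer))

print(rows)
print(records)
```

The field containing a comma is quoted on output and restored unchanged on input, so the caller never handles the quoting rules directly.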

Module Contents

The :mod:`csv` module defines the following functions:

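
A minimal sketch of the registry-related functions, assuming a made-up dialect name ('pipes') chosen only for this example.  It registers a dialect, looks it up, removes it again, and queries the current field size limit.

```python
import csv

# 'pipes' is a hypothetical dialect name used only for this illustration.
csv.register_dialect('pipes', delimiter='|', quoting=csv.QUOTE_MINIMAL)

registered = 'pipes' in csv.list_dialects()
pipe_delimiter = csv.get_dialect('pipes').delimiter

# Remove the dialect again so the registry is left as it was found.
csv.unregister_dialect('pipes')
still_registered = 'pipes' in csv.list_dialects()

# Called without arguments, field_size_limit() returns the current limit.
current_limit = csv.field_size_limit()

print(registered, pipe_delimiter, still_registered, current_limit)
```
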
The :mod:`csv` module defines the following classes:

The :class:`Dialect` class is a container class relied on primarily for its
attributes, which are used to define the parameters for a specific
:class:`reader` or :class:`writer` instance.

The :class:`excel` class defines the usual properties of an Excel-generated CSV
file.  It is registered with the dialect name 'excel'.

The :class:`excel_tab` class defines the usual properties of an Excel-generated
TAB-delimited file.  It is registered with the dialect name 'excel-tab'.

The :class:`unix_dialect` class defines the usual properties of a CSV file
generated on UNIX systems, i.e. using '\n' as line terminator and quoting
all fields.  It is registered with the dialect name 'unix'.
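
To make the differences between the three registered dialects concrete, the sketch below writes the same row with each of them.  The ``render`` helper and the sample row are assumptions introduced for the example and are not part of the module.

```python
import csv
import io

def render(row, dialect):
    """Hypothetical helper: write one row with the given dialect, return the text."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer, dialect=dialect).writerow(row)
    return buffer.getvalue()

row = ['id', 42, 'a,b']

excel_line = render(row, 'excel')          # comma-delimited, minimal quoting, \r\n
excel_tab_line = render(row, 'excel-tab')  # TAB-delimited, minimal quoting, \r\n
unix_line = render(row, 'unix')            # comma-delimited, all fields quoted, \n

print(repr(excel_line))
print(repr(excel_tab_line))
print(repr(unix_line))
```
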

The :class:`Sniffer` class is used to deduce the format of a CSV file.

The :class:`Sniffer` class provides two methods:

An example for :class:`Sniffer` use:
import csv

with open('example.csv', newline='') as csvfile:
    dialect = csv.Sniffer().sniff(csvfile.read(1024))
    csvfile.seek(0)
    reader = csv.reader(csvfile, dialect)
    # ... process CSV file contents here ...
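
The same idea can be tried without a file on disk.  The sketch below, which assumes a small semicolon-delimited sample invented for the example, uses both :class:`Sniffer` methods: :meth:`sniff` to guess the dialect and :meth:`has_header` to guess whether the first row is a header.  Both methods are heuristics, so results on other data may differ.

```python
import csv
import io

# Invented sample data: semicolon-delimited with a header row.
sample = 'name;age\nAda;36\nGrace;45\nLinus;28\n'

sniffer = csv.Sniffer()

# Restrict the candidate delimiters to make the guess more reliable.
dialect = sniffer.sniff(sample, delimiters=';,')

# Heuristic check: does the first row look different from the data rows?
has_header = sniffer.has_header(sample)

rows = list(csv.reader(io.StringIO(sample), dialect))

print(dialect.delimiter, has_header)
print(rows)
```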

The :mod:`csv` module defines the following constants:

The :mod:`csv` module defines the following exception:
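
As a hedged illustration of the quoting constants and of the module's exception, the sketch below writes one row under each quoting mode.  The ``render`` helper and the sample row are assumptions made for the example.  Note that ``QUOTE_NONE`` needs an escapechar when a field contains the delimiter; without one, the writer raises :exc:`csv.Error`.

```python
import csv
import io

def render(row, **fmtparams):
    """Hypothetical helper: write one row to a buffer and return the text."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer, **fmtparams).writerow(row)
    return buffer.getvalue()

row = ['text', 3.5, 'a,b']

minimal = render(row, quoting=csv.QUOTE_MINIMAL)        # quote only when required
everything = render(row, quoting=csv.QUOTE_ALL)         # quote every field
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)  # quote non-numeric fields
escaped = render(row, quoting=csv.QUOTE_NONE, escapechar='\\')  # escape, never quote

# Without an escapechar, QUOTE_NONE cannot represent the embedded comma.
try:
    render(row, quoting=csv.QUOTE_NONE)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

for line in (minimal, everything, nonnumeric, escaped):
    print(repr(line))
print(error_message)
```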

Dialects and Formatting Parameters

To make it easier to specify the format of input and output records, specific
formatting parameters are grouped together into dialects.  A dialect is a
subclass of the :class:`Dialect` class having a set of specific attributes and a
single :meth:`validate` method.  When creating :class:`reader` or
:class:`writer` objects, the programmer can specify a string or a subclass of
the :class:`Dialect` class as the dialect parameter.  In addition to, or instead
of, the dialect parameter, the programmer can also specify individual
formatting parameters, which have the same names as the attributes defined below
for the :class:`Dialect` class.

Dialects support the following attributes:
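
A minimal sketch of a custom dialect, assuming a made-up class name (``SemicolonDialect``) and sample input.  It shows a :class:`Dialect` subclass passed as the dialect parameter, and an individual formatting parameter overriding one of its attributes.

```python
import csv
import io

class SemicolonDialect(csv.Dialect):
    """Hypothetical dialect: semicolon-delimited, spaces after delimiters ignored."""
    delimiter = ';'
    quotechar = '"'
    doublequote = True
    skipinitialspace = True
    lineterminator = '\n'
    quoting = csv.QUOTE_MINIMAL

source = io.StringIO('a; b; c\n')

# Use the dialect as defined: leading spaces after ';' are dropped.
by_class = next(csv.reader(source, dialect=SemicolonDialect))

# An individual formatting parameter takes precedence over the dialect.
source.seek(0)
overridden = next(csv.reader(source, dialect=SemicolonDialect,
                             skipinitialspace=False))

print(by_class)
print(overridden)
```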

Reader Objects

Reader objects (:class:`DictReader` instances and objects returned by the
:func:`reader` function) have the following public methods:

Reader objects have the following public attributes:

:class:`DictReader` objects have the following public attribute:
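
The sketch below, built on invented sample data, shows two reader attributes in use: ``fieldnames`` on a :class:`DictReader` and ``line_num``, which counts source lines rather than records and therefore jumps by two when a quoted field spans a line break.

```python
import csv
import io

# Invented sample: the first record contains a newline inside a quoted field.
source = io.StringIO('id,note\n1,"first\nsecond"\n2,plain\n')

reader = csv.DictReader(source)

# Accessing fieldnames reads the header row if it has not been read yet.
fieldnames = reader.fieldnames

records = []
for record in reader:
    # line_num is the number of source lines consumed so far.
    records.append((reader.line_num, record['id']))

print(fieldnames)
print(records)
```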

Writer Objects
:class:`Writer` objects (:class:`DictWriter` instances and objects returned by
the :func:`writer` function) have the following public methods.  A row must be
an iterable of strings or numbers for :class:`Writer` objects and a dictionary
mapping fieldnames to strings or numbers (by passing them through :func:`str`
first) for :class:`DictWriter` objects.  Note that complex numbers are written
out surrounded by parens. This may cause some problems for other programs which
read CSV files (assuming they support complex numbers at all).
Writer objects have the following public attribute:
DictWriter objects have the following public method:
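
A short sketch of :class:`DictWriter` in use, with field names and values invented for the example.  It writes a header row, a row with a float, a row with a complex number (which comes out in parentheses, as noted above), and a row with a missing key.

```python
import csv
import io

buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'value'])

writer.writeheader()                                    # header row from fieldnames
writer.writerow({'name': 'pi', 'value': 3.14})
writer.writerow({'name': 'z', 'value': complex(1, 2)})  # written as (1+2j)
writer.writerow({'name': 'missing'})                    # absent key -> empty field

output = buffer.getvalue()
print(output)
```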

Examples

The simplest example of reading a CSV file:
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

Reading a file with an alternate format:
import csv
with open('passwd', newline='') as f:
    reader = csv.reader(f, delimiter=':', quoting=csv.QUOTE_NONE)
    for row in reader:
        print(row)

The corresponding simplest possible writing example is:
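import csv
# Placeholder data: any iterable of rows (each itself an iterable) works here.
someiterable = [['name', 'value'], ['spam', 1]]
with open('some.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(someiterable)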

Since :func:`open` is used to open a CSV file for reading, the file
will by default be decoded into Unicode using the system default
encoding (see :func:`locale.getpreferredencoding`).  To decode a file
using a different encoding, use the encoding argument of :func:`open`:
import csv
with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

The same applies to writing in something other than the system default
encoding: specify the encoding argument when opening the output file.

Registering a new dialect:
import csv
csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE)
with open('passwd', newline='') as f:
    reader = csv.reader(f, 'unixpwd')

A slightly more advanced use of the reader --- catching and reporting errors:
import csv, sys
filename = 'some.csv'
with open(filename, newline='') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print(row)
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(filename, reader.line_num, e))

And while the module doesn't directly support parsing strings, it can easily be
done:
import csv
for row in csv.reader(['one,two,three']):
    print(row)
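
When the string holds several lines, one option is to wrap it in :class:`io.StringIO` so the reader sees a file-like object.  This is a sketch with invented sample text; passing newline='' keeps line breaks inside quoted fields intact.

```python
import csv
import io

# Invented sample text: the second record has a newline inside a quoted field.
text = 'id,comment\n1,"line one\nline two"\n'

# newline='' leaves the line endings untouched for the csv module to handle.
rows = list(csv.reader(io.StringIO(text, newline='')))

for row in rows:
    print(row)
```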

Footnotes


[1]
If newline='' is not specified, newlines embedded inside quoted fields
will not be interpreted correctly, and on platforms that use \r\n line endings
an extra \r will be added on write.  It should always be safe to specify
newline='', since the csv module does its own
(:term:`universal <universal newlines>`) newline handling.