Closes #23181: codepoint -> code point

3be472b5 · Georg Brandl · 1a8ada89
Commit 3be472b5 authored Jan 14, 2015 by Georg Brandl
7 changed files
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1141,7 +1141,7 @@ These are the UTF-32 codec APIs:
   mark (U+FEFF). In the other two modes, no BOM mark is prepended.
   If *Py_UNICODE_WIDE* is not defined, surrogate pairs will be output
-   as a single codepoint.
+   as a single code point.
   Return *NULL* if an exception was raised by the codec.

--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -841,7 +841,7 @@ methods and attributes from the underlying stream.
 Encodings and Unicode
 ---------------------
-Strings are stored internally as sequences of codepoints in
+Strings are stored internally as sequences of code points in
 range ``0x0``-``0x10FFFF``.  (See :pep:`393` for
 more details about the implementation.)
 Once a string object is used outside of CPU and memory, endianness
@@ -852,23 +852,23 @@ There are a variety of different text serialisation codecs, which are
 collectivity referred to as :term:`text encodings <text encoding>`.
 The simplest text encoding (called ``'latin-1'`` or ``'iso-8859-1'``) maps
-the codepoints 0-255 to the bytes ``0x0``-``0xff``, which means that a string
+the code points 0-255 to the bytes ``0x0``-``0xff``, which means that a string
-object that contains codepoints above ``U+00FF`` can't be encoded with this
+object that contains code points above ``U+00FF`` can't be encoded with this
 codec. Doing so will raise a :exc:`UnicodeEncodeError` that looks
 like the following (although the details of the error message may differ):
 ``UnicodeEncodeError: 'latin-1' codec can't encode character '\u1234' in
 position 3: ordinal not in range(256)``.
 There's another group of encodings (the so called charmap encodings) that choose
-a different subset of all Unicode code points and how these codepoints are
+a different subset of all Unicode code points and how these code points are
 mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
 e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on
 Windows). There's a string constant with 256 characters that shows you which
 character is mapped to which byte value.
-All of these encodings can only encode 256 of the 1114112 codepoints
+All of these encodings can only encode 256 of the 1114112 code points
 defined in Unicode. A simple and straightforward way that can store each Unicode
-code point, is to store each codepoint as four consecutive bytes. There are two
+code point, is to store each code point as four consecutive bytes. There are two
 possibilities: store the bytes in big endian or in little endian order. These
 two encodings are called ``UTF-32-BE`` and ``UTF-32-LE`` respectively. Their
 disadvantage is that if e.g. you use ``UTF-32-BE`` on a little endian machine you

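Reviewer note: the behaviour described in the codecs.rst hunk above can be checked with a short sketch. The sample strings and variable names below are illustrative assumptions and are not part of the commit.

```python
# Latin-1 maps code points 0-255 directly to the bytes 0x00-0xff.
text = "caf\u00e9"                      # 'é' is U+00E9, inside the Latin-1 range
latin1_bytes = text.encode("latin-1")

# A code point above U+00FF cannot be encoded with latin-1.
try:
    "abc\u1234".encode("latin-1")
    latin1_error = None
except UnicodeEncodeError as exc:
    latin1_error = exc                  # the failing character is at index 3

# UTF-32 stores every code point as four bytes; only the byte order differs.
utf32_be = "a".encode("utf-32-be")
utf32_le = "a".encode("utf-32-le")
```

The explicit ``-be`` and ``-le`` codec names are used here so that no byte order mark is written, which keeps the four-byte layout easy to compare.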
--- a/Doc/library/email.mime.rst
+++ b/Doc/library/email.mime.rst
@@ -194,7 +194,7 @@ Here are the classes:
   minor type and defaults to :mimetype:`plain`.  *_charset* is the character
   set of the text and is passed as an argument to the
   :class:`~email.mime.nonmultipart.MIMENonMultipart` constructor; it defaults
-   to ``us-ascii`` if the string contains only ``ascii`` codepoints, and
+   to ``us-ascii`` if the string contains only ``ascii`` code points, and
   ``utf-8`` otherwise.  The *_charset* parameter accepts either a string or a
   :class:`~email.charset.Charset` instance.

--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -156,7 +156,7 @@ are always available.  They are listed here in alphabetical order.
 .. function:: chr(i)
-   Return the string representing a character whose Unicode codepoint is the
+   Return the string representing a character whose Unicode code point is the
   integer *i*.  For example, ``chr(97)`` returns the string ``'a'``, while
   ``chr(931)`` returns the string ``'Σ'``. This is the inverse of :func:`ord`.

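Reviewer note: a minimal sketch of the ``chr``/``ord`` relationship described in the functions.rst hunk above, using the two values the documentation itself mentions. The variable names are illustrative only.

```python
# chr() turns an integer code point into a one-character string.
letter = chr(97)             # Latin small letter a
sigma = chr(931)             # Greek capital letter sigma

# ord() is the inverse: it returns the code point of a one-character string.
round_trip = ord(chr(931))
```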
--- a/Doc/library/html.entities.rst
+++ b/Doc/library/html.entities.rst
@@ -33,12 +33,12 @@ This module defines four dictionaries, :data:`html5`,
 .. data:: name2codepoint
-   A dictionary that maps HTML entity names to the Unicode codepoints.
+   A dictionary that maps HTML entity names to the Unicode code points.
 .. data:: codepoint2name
-   A dictionary that maps Unicode codepoints to HTML entity names.
+   A dictionary that maps Unicode code points to HTML entity names.
 .. rubric:: Footnotes

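Reviewer note: the two dictionaries touched in the html.entities.rst hunk above map in opposite directions. A small sketch, assuming the ``euro`` entity as an arbitrary example:

```python
from html.entities import codepoint2name, name2codepoint

# Entity name -> Unicode code point (an integer).
euro_code_point = name2codepoint["euro"]

# The code point can be turned into the character with chr().
euro_char = chr(euro_code_point)

# Unicode code point -> entity name.
euro_name = codepoint2name[0x20AC]
```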
--- a/Doc/tutorial/datastructures.rst
+++ b/Doc/tutorial/datastructures.rst
@@ -685,7 +685,7 @@ the same type, the lexicographical comparison is carried out recursively.  If
 all items of two sequences compare equal, the sequences are considered equal.
 If one sequence is an initial sub-sequence of the other, the shorter sequence is
 the smaller (lesser) one.  Lexicographical ordering for strings uses the Unicode
-codepoint number to order individual characters.  Some examples of comparisons
+code point number to order individual characters.  Some examples of comparisons
 between sequences of the same type::
   (1, 2, 3)              < (1, 2, 4)

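Reviewer note: the statement in the datastructures.rst hunk above, that strings are ordered by Unicode code point number, can be illustrated as follows. The sample words are assumptions chosen for the example.

```python
# Uppercase letters have lower code points than lowercase ones,
# so "Zebra" sorts before "apple": ord("Z") is 90, ord("a") is 97.
upper_before_lower = "Zebra" < "apple"

# sorted() uses the same code point ordering for single characters.
by_code_point = sorted(["b", "B", "\u00e9", "a"])
```

This ordering is purely numeric; it is not the same as locale-aware alphabetical collation.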
--- a/Doc/whatsnew/3.3.rst
+++ b/Doc/whatsnew/3.3.rst
@@ -228,7 +228,7 @@ Functionality
 Changes introduced by :pep:`393` are the following:
-* Python now always supports the full range of Unicode codepoints, including
+* Python now always supports the full range of Unicode code points, including
  non-BMP ones (i.e. from ``U+0000`` to ``U+10FFFF``).  The distinction between
  narrow and wide builds no longer exists and Python now behaves like a wide
  build, even under Windows.
@@ -246,7 +246,7 @@ Changes introduced by :pep:`393` are the following:
    so ``'\U0010FFFF'[0]`` now returns ``'\U0010FFFF'`` and not ``'\uDBFF'``;
  * all other functions in the standard library now correctly handle
-    non-BMP codepoints.
+    non-BMP code points.
 * The value of :data:`sys.maxunicode` is now always ``1114111`` (``0x10FFFF``
  in hexadecimal).  The :c:func:`PyUnicode_GetMax` function still returns
@@ -258,13 +258,13 @@ Changes introduced by :pep:`393` are the following:
 Performance and resource usage
 ------------------------------
-The storage of Unicode strings now depends on the highest codepoint in the string:
+The storage of Unicode strings now depends on the highest code point in the string:
-* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per codepoint;
+* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;
-* BMP strings (``U+0000-U+FFFF``) use 2 bytes per codepoint;
+* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
-* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per codepoint.
+* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.
 The net effect is that for most applications, memory usage of string
 storage should decrease significantly - especially compared to former