Commit 266a436f authored by Martin v. Löwis

Remove claims that Python source code is ASCII. Fixes #1026038.

parent 879ddf30
@@ -73,6 +73,8 @@ Comments are ignored by the syntax; they are not tokens.
 \subsection{Encoding declarations\label{encodings}}
+\index{source character set}
+\index{encodings}
 If a comment in the first or second line of the Python script matches
 the regular expression \regexp{coding[=:]\e s*([-\e w.]+)}, this comment is
@@ -385,16 +387,18 @@ String literals are described by the following lexical definitions:
 \production{longstringitem}
 {\token{longstringchar} | \token{escapeseq}}
 \production{shortstringchar}
-{<any ASCII character except "\e" or newline or the quote>}
+{<any source character except "\e" or newline or the quote>}
 \production{longstringchar}
-{<any ASCII character except "\e">}
+{<any source character except "\e">}
 \production{escapeseq}
 {"\e" <any ASCII character>}
 \end{productionlist}
 One syntactic restriction not indicated by these productions is that
 whitespace is not allowed between the \grammartoken{stringprefix} and
-the rest of the string literal.
+the rest of the string literal. The source character set is defined
+by the encoding declaration; it is \ASCII if no encoding declaration
+is given in the source file; see \ref{encodings}.
 \index{triple-quoted string}
 \index{Unicode Consortium}
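
The rule this hunk refers to (an encoding declaration in the first or second line, otherwise ASCII) can be sketched in Python. This is an illustrative sketch only, not the interpreter's implementation: the helper name `declared_encoding` is hypothetical, and treating only whole-line comments as candidates is a simplifying assumption. The regular expression is the one quoted in the documentation text.

```python
import re

# Regular expression quoted in the "Encoding declarations" section.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(source_lines, default="ascii"):
    """Return the encoding named in the first two lines, else the default.

    Hypothetical helper: it only inspects lines that consist of a comment,
    which is a simplification made for this sketch.
    """
    for line in source_lines[:2]:
        stripped = line.lstrip()
        if not stripped.startswith("#"):
            continue  # not a comment line, so it cannot hold a declaration
        match = CODING_RE.search(stripped)
        if match:
            return match.group(1)
    return default


emacs_style = ["#!/usr/bin/python", "# -*- coding: latin-1 -*-"]
vim_style = ["# vim: set fileencoding=utf-8 :"]
undeclared = ["import sys", "print(sys.argv)"]
too_late = ["import sys", "", "# -*- coding: utf-8 -*-"]
```

With these inputs, `declared_encoding(emacs_style)` yields `latin-1`, and both `undeclared` and `too_late` fall back to the ASCII default because a declaration on the third line is outside the range the text describes.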
@@ -447,8 +451,8 @@ to those used by Standard C. The recognized escape sequences are:
 \lineiii{\e U\var{xxxxxxxx}}
 {Character with 32-bit hex value \var{xxxxxxxx} (Unicode only)}{(2)}
 \lineiii{\e v} {\ASCII{} Vertical Tab (VT)}{}
-\lineiii{\e\var{ooo}} {\ASCII{} character with octal value \var{ooo}}{(3)}
-\lineiii{\e x\var{hh}} {\ASCII{} character with hex value \var{hh}}{(4)}
+\lineiii{\e\var{ooo}} {Character with octal value \var{ooo}}{(3,5)}
+\lineiii{\e x\var{hh}} {Character with hex value \var{hh}}{(4,5)}
 \end{tableiii}
 \index{ASCII@\ASCII}
@@ -469,6 +473,12 @@ Notes:
 As in Standard C, up to three octal digits are accepted.
 \item[(4)]
 Unlike in Standard C, at most two hex digits are accepted.
+\item[(5)]
+In a string literal, hexadecimal and octal escapes denote the
+byte with the given value; it is not necessary that the byte
+encodes a character in the source character set. In a Unicode
+literal, these escapes denote a Unicode character with the given
+value.
 \end{itemize}
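
The distinction drawn in note (5) can be illustrated with a short Python sketch. The note describes the plain string and Unicode literals of the Python version this commit documents. The example below assumes Python 3, where the closest analogues are `bytes` literals (escapes denote bytes) and `str` literals (escapes denote Unicode characters). The variable names are illustrative.

```python
# Bytes literal: each hex or octal escape denotes one byte with that value.
byte_literal = b"\xe9\101"   # bytes 0xE9 and 0x41 (octal 101)

# Text literal: the same escapes denote Unicode code points.
text_literal = "\xe9\101"    # U+00E9 followed by U+0041

# The byte 0xE9 need not encode a character in any particular character
# set; for example, it is not valid ASCII.
try:
    byte_literal.decode("ascii")
    decodes_as_ascii = True
except UnicodeDecodeError:
    decodes_as_ascii = False
```

Here `byte_literal` holds two bytes whose values are exactly those written in the escapes, while `text_literal` holds the two characters U+00E9 and `A`.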