Commit c585df94 authored by Facundo Batista

Issue 600362: Relocated parse_qs() and parse_qsl() from the cgi module
to the urlparse one. Added a PendingDeprecationWarning in the old
module; it will be deprecated in the future. Docs and tests updated.
parent 69acb433
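As a quick illustration of what the relocated functions return, here is a minimal sketch that calls them from their new home. The `urllib.parse` fallback and the sample query string are assumptions added so the snippet also runs on Python 3; they are not part of this commit.

```python
# Import from the new location introduced by this commit. The fallback to
# urllib.parse (Python 3) is an assumption of this sketch, not part of the diff.
try:
    from urlparse import parse_qs, parse_qsl
except ImportError:
    from urllib.parse import parse_qs, parse_qsl

# Hypothetical query string with a repeated name and a '+'-encoded space.
query = "a=1&a=2&b=x+y"

# parse_qs groups values by name: each key maps to a list of values.
as_dict = parse_qs(query)

# parse_qsl keeps the original order as a flat list of (name, value) pairs.
as_pairs = parse_qsl(query)

print(as_dict)
print(as_pairs)
```

Code that previously called `cgi.parse_qs` or `cgi.parse_qsl` should only need its import changed, since the signatures are unchanged by the move.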
...@@ -282,49 +282,18 @@ algorithms implemented in this module in other circumstances. ...@@ -282,49 +282,18 @@ algorithms implemented in this module in other circumstances.
Parse a query in the environment or from a file (the file defaults to Parse a query in the environment or from a file (the file defaults to
``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are ``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are
passed to :func:`parse_qs` unchanged. passed to :func:`urlparse.parse_qs` unchanged.
.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]]) .. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
Parse a query string given as a string argument (data of type This function is deprecated in this module. Use :func:`urlparse.parse_qs`
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a instead. It is maintained here only for backward compatibility.
dictionary. The dictionary keys are the unique query variable names and the
values are lists of values for each name.
The optional argument *keep_blank_values* is a flag indicating whether blank
values in URL encoded queries should be treated as blank strings. A true value
indicates that blanks should be retained as blank strings. The default false
value indicates that blank values are to be ignored and treated as if they were
not included.
The optional argument *strict_parsing* is a flag indicating what to do with
parsing errors. If false (the default), errors are silently ignored. If true,
errors raise a :exc:`ValueError` exception.
Use the :func:`urllib.urlencode` function to convert such dictionaries into
query strings.
.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]]) .. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
Parse a query string given as a string argument (data of type This function is deprecated in this module. Use :func:`urlparse.parse_qsl`
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of instead. It is maintained here only for backward compatibility.
name, value pairs.
The optional argument *keep_blank_values* is a flag indicating whether blank
values in URL encoded queries should be treated as blank strings. A true value
indicates that blanks should be retained as blank strings. The default false
value indicates that blank values are to be ignored and treated as if they were
not included.
The optional argument *strict_parsing* is a flag indicating what to do with
parsing errors. If false (the default), errors are silently ignored. If true,
errors raise a :exc:`ValueError` exception.
Use the :func:`urllib.urlencode` function to convert such lists of pairs into
query strings.
.. function:: parse_multipart(fp, pdict) .. function:: parse_multipart(fp, pdict)
...@@ -332,7 +301,7 @@ algorithms implemented in this module in other circumstances. ...@@ -332,7 +301,7 @@ algorithms implemented in this module in other circumstances.
Arguments are *fp* for the input file and *pdict* for a dictionary containing Arguments are *fp* for the input file and *pdict* for a dictionary containing
other parameters in the :mailheader:`Content-Type` header. other parameters in the :mailheader:`Content-Type` header.
Returns a dictionary just like :func:`parse_qs` keys are the field names, each Returns a dictionary just like :func:`urlparse.parse_qs` keys are the field names, each
value is a list of values for that field. This is easy to use but not much good value is a list of values for that field. This is easy to use but not much good
if you are expecting megabytes to be uploaded --- in that case, use the if you are expecting megabytes to be uploaded --- in that case, use the
:class:`FieldStorage` class instead which is much more flexible. :class:`FieldStorage` class instead which is much more flexible.
......
...@@ -242,7 +242,7 @@ Utility functions ...@@ -242,7 +242,7 @@ Utility functions
of the sequence. When a sequence of two-element tuples is used as the *query* of the sequence. When a sequence of two-element tuples is used as the *query*
argument, the first element of each tuple is a key and the second is a value. argument, the first element of each tuple is a key and the second is a value.
The order of parameters in the encoded string will match the order of parameter The order of parameters in the encoded string will match the order of parameter
tuples in the sequence. The :mod:`cgi` module provides the functions tuples in the sequence. The :mod:`urlparse` module provides the functions
:func:`parse_qs` and :func:`parse_qsl` which are used to parse query strings :func:`parse_qs` and :func:`parse_qsl` which are used to parse query strings
into Python data structures. into Python data structures.
......
...@@ -101,6 +101,45 @@ The :mod:`urlparse` module defines the following functions: ...@@ -101,6 +101,45 @@ The :mod:`urlparse` module defines the following functions:
.. versionchanged:: 2.5 .. versionchanged:: 2.5
Added attributes to return value. Added attributes to return value.
.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
Parse a query string given as a string argument (data of type
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a
dictionary. The dictionary keys are the unique query variable names and the
values are lists of values for each name.
The optional argument *keep_blank_values* is a flag indicating whether blank
values in URL encoded queries should be treated as blank strings. A true value
indicates that blanks should be retained as blank strings. The default false
value indicates that blank values are to be ignored and treated as if they were
not included.
The optional argument *strict_parsing* is a flag indicating what to do with
parsing errors. If false (the default), errors are silently ignored. If true,
errors raise a :exc:`ValueError` exception.
Use the :func:`urllib.urlencode` function to convert such dictionaries into
query strings.
.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
Parse a query string given as a string argument (data of type
:mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of
name, value pairs.
The optional argument *keep_blank_values* is a flag indicating whether blank
values in URL encoded queries should be treated as blank strings. A true value
indicates that blanks should be retained as blank strings. The default false
value indicates that blank values are to be ignored and treated as if they were
not included.
The optional argument *strict_parsing* is a flag indicating what to do with
parsing errors. If false (the default), errors are silently ignored. If true,
errors raise a :exc:`ValueError` exception.
Use the :func:`urllib.urlencode` function to convert such lists of pairs into
query strings.
.. function:: urlunparse(parts) .. function:: urlunparse(parts)
......
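The two optional flags documented above can be exercised with a short sketch. The query strings are made up for illustration, and the `urllib.parse` fallback is an assumption so the snippet also runs on Python 3.

```python
# The urllib.parse fallback is an assumption of this sketch, not part of the diff.
try:
    from urlparse import parse_qsl
except ImportError:
    from urllib.parse import parse_qsl

# Hypothetical query in which 'a' has a blank value.
query = "a=&b=1"

# Default: blank values are dropped as if they were not included.
default_pairs = parse_qsl(query)

# keep_blank_values=True retains them as empty strings.
kept_pairs = parse_qsl(query, keep_blank_values=True)

# strict_parsing=True turns a field without '=' into a ValueError
# instead of silently ignoring it.
try:
    parse_qsl("novalue", strict_parsing=True)
    strict_failed = False
except ValueError:
    strict_failed = True

print(default_pairs)
print(kept_pairs)
print(strict_failed)
```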
...@@ -39,7 +39,9 @@ import sys ...@@ -39,7 +39,9 @@ import sys
import os import os
import urllib import urllib
import UserDict import UserDict
from warnings import filterwarnings, catch_warnings import urlparse
from warnings import filterwarnings, catch_warnings, warn
with catch_warnings(): with catch_warnings():
if sys.py3kwarning: if sys.py3kwarning:
filterwarnings("ignore", ".*mimetools has been removed", filterwarnings("ignore", ".*mimetools has been removed",
...@@ -173,72 +175,21 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): ...@@ -173,72 +175,21 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0):
return parse_qs(qs, keep_blank_values, strict_parsing) return parse_qs(qs, keep_blank_values, strict_parsing)
def parse_qs(qs, keep_blank_values=0, strict_parsing=0): # parse query string function called from urlparse,
"""Parse a query given as a string argument. # this is done in order to maintain backward compatibility.
Arguments:
qs: URL-encoded query string to be parsed
keep_blank_values: flag indicating whether blank values in def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
URL encoded queries should be treated as blank strings. """Parse a query given as a string argument."""
A true value indicates that blanks should be retained as warn("cgi.parse_qs is deprecated, use urlparse.parse_qs instead",
blank strings. The default false value indicates that PendingDeprecationWarning)
blank values are to be ignored and treated as if they were return urlparse.parse_qs(qs, keep_blank_values, strict_parsing)
not included.
strict_parsing: flag indicating what to do with parsing errors.
If false (the default), errors are silently ignored.
If true, errors raise a ValueError exception.
"""
dict = {}
for name, value in parse_qsl(qs, keep_blank_values, strict_parsing):
if name in dict:
dict[name].append(value)
else:
dict[name] = [value]
return dict
def parse_qsl(qs, keep_blank_values=0, strict_parsing=0): def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
"""Parse a query given as a string argument. """Parse a query given as a string argument."""
warn("cgi.parse_qsl is deprecated, use urlparse.parse_qsl instead",
Arguments: PendingDeprecationWarning)
return urlparse.parse_qsl(qs, keep_blank_values, strict_parsing)
qs: URL-encoded query string to be parsed
keep_blank_values: flag indicating whether blank values in
URL encoded queries should be treated as blank strings. A
true value indicates that blanks should be retained as blank
strings. The default false value indicates that blank values
are to be ignored and treated as if they were not included.
strict_parsing: flag indicating what to do with parsing errors. If
false (the default), errors are silently ignored. If true,
errors raise a ValueError exception.
Returns a list, as G-d intended.
"""
pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
r = []
for name_value in pairs:
if not name_value and not strict_parsing:
continue
nv = name_value.split('=', 1)
if len(nv) != 2:
if strict_parsing:
raise ValueError, "bad query field: %r" % (name_value,)
# Handle case of a control-name with no equal sign
if keep_blank_values:
nv.append('')
else:
continue
if len(nv[1]) or keep_blank_values:
name = urllib.unquote(nv[0].replace('+', ' '))
value = urllib.unquote(nv[1].replace('+', ' '))
r.append((name, value))
return r
def parse_multipart(fp, pdict): def parse_multipart(fp, pdict):
"""Parse multipart input. """Parse multipart input.
...@@ -645,8 +596,8 @@ class FieldStorage: ...@@ -645,8 +596,8 @@ class FieldStorage:
if self.qs_on_post: if self.qs_on_post:
qs += '&' + self.qs_on_post qs += '&' + self.qs_on_post
self.list = list = [] self.list = list = []
for key, value in parse_qsl(qs, self.keep_blank_values, for key, value in urlparse.parse_qsl(qs, self.keep_blank_values,
self.strict_parsing): self.strict_parsing):
list.append(MiniFieldStorage(key, value)) list.append(MiniFieldStorage(key, value))
self.skip_lines() self.skip_lines()
...@@ -659,8 +610,8 @@ class FieldStorage: ...@@ -659,8 +610,8 @@ class FieldStorage:
raise ValueError, 'Invalid boundary in multipart form: %r' % (ib,) raise ValueError, 'Invalid boundary in multipart form: %r' % (ib,)
self.list = [] self.list = []
if self.qs_on_post: if self.qs_on_post:
for key, value in parse_qsl(self.qs_on_post, self.keep_blank_values, for key, value in urlparse.parse_qsl(self.qs_on_post,
self.strict_parsing): self.keep_blank_values, self.strict_parsing):
self.list.append(MiniFieldStorage(key, value)) self.list.append(MiniFieldStorage(key, value))
FieldStorageClass = None FieldStorageClass = None
......
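The compatibility wrappers left in `cgi` follow a warn-then-delegate pattern. The sketch below reproduces that pattern with a hypothetical stand-in named `old_parse_qsl` (not a real library function) and shows one way to observe the warning, which Python hides by default for `PendingDeprecationWarning`. The `urllib.parse` fallback and `stacklevel=2` are assumptions of this sketch, not part of the commit.

```python
import warnings

# The urllib.parse fallback is an assumption so the sketch also runs on Python 3.
try:
    from urlparse import parse_qsl as _new_parse_qsl
except ImportError:
    from urllib.parse import parse_qsl as _new_parse_qsl


def old_parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
    """Hypothetical stand-in for a deprecated wrapper: warn, then delegate."""
    # stacklevel=2 attributes the warning to the caller rather than to this
    # wrapper; the commit itself does not pass stacklevel.
    warnings.warn("old_parse_qsl is deprecated, use parse_qsl instead",
                  PendingDeprecationWarning, stacklevel=2)
    return _new_parse_qsl(qs, keep_blank_values, strict_parsing)


# Record warnings so the normally silent PendingDeprecationWarning is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pairs = old_parse_qsl("a=1&b=2")

categories = [w.category for w in caught]
print(pairs)
print(categories)
```

The wrapper must delegate to the function of the same name; a `parse_qsl` wrapper that calls `parse_qs` would return a dictionary instead of a list of pairs.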
...@@ -55,23 +55,6 @@ def do_test(buf, method): ...@@ -55,23 +55,6 @@ def do_test(buf, method):
except StandardError, err: except StandardError, err:
return ComparableException(err) return ComparableException(err)
# A list of test cases. Each test case is a two-tuple that contains
# a string with the query and the expected list of (name, value) pairs.
parse_qsl_test_cases = [
("", []),
("&", []),
("&&", []),
("=", [('', '')]),
("=a", [('', 'a')]),
("a", [('a', '')]),
("a=", [('a', '')]),
("a=", [('a', '')]),
("&a=b", [('a', 'b')]),
("a=a+b&b=b+c", [('a', 'a b'), ('b', 'b c')]),
("a=1&a=2", [('a', '1'), ('a', '2')]),
]
parse_strict_test_cases = [ parse_strict_test_cases = [
("", ValueError("bad query field: ''")), ("", ValueError("bad query field: ''")),
("&", ValueError("bad query field: ''")), ("&", ValueError("bad query field: ''")),
...@@ -143,11 +126,6 @@ def gen_result(data, environ): ...@@ -143,11 +126,6 @@ def gen_result(data, environ):
class CgiTests(unittest.TestCase): class CgiTests(unittest.TestCase):
def test_qsl(self):
for orig, expect in parse_qsl_test_cases:
result = cgi.parse_qsl(orig, keep_blank_values=True)
self.assertEqual(result, expect, "Error parsing %s" % repr(orig))
def test_strict(self): def test_strict(self):
for orig, expect in parse_strict_test_cases: for orig, expect in parse_strict_test_cases:
# Test basic parsing # Test basic parsing
......
...@@ -8,6 +8,23 @@ RFC1808_BASE = "http://a/b/c/d;p?q#f" ...@@ -8,6 +8,23 @@ RFC1808_BASE = "http://a/b/c/d;p?q#f"
RFC2396_BASE = "http://a/b/c/d;p?q" RFC2396_BASE = "http://a/b/c/d;p?q"
RFC3986_BASE = "http://a/b/c/d;p?q" RFC3986_BASE = "http://a/b/c/d;p?q"
# A list of test cases. Each test case is a two-tuple that contains
# a string with the query and the expected list of (name, value) pairs.
parse_qsl_test_cases = [
("", []),
("&", []),
("&&", []),
("=", [('', '')]),
("=a", [('', 'a')]),
("a", [('a', '')]),
("a=", [('a', '')]),
("a=", [('a', '')]),
("&a=b", [('a', 'b')]),
("a=a+b&b=b+c", [('a', 'a b'), ('b', 'b c')]),
("a=1&a=2", [('a', '1'), ('a', '2')]),
]
class UrlParseTestCase(unittest.TestCase): class UrlParseTestCase(unittest.TestCase):
def checkRoundtrips(self, url, parsed, split): def checkRoundtrips(self, url, parsed, split):
...@@ -61,6 +78,11 @@ class UrlParseTestCase(unittest.TestCase): ...@@ -61,6 +78,11 @@ class UrlParseTestCase(unittest.TestCase):
self.assertEqual(result3.hostname, result.hostname) self.assertEqual(result3.hostname, result.hostname)
self.assertEqual(result3.port, result.port) self.assertEqual(result3.port, result.port)
def test_qsl(self):
for orig, expect in parse_qsl_test_cases:
result = urlparse.parse_qsl(orig, keep_blank_values=True)
self.assertEqual(result, expect, "Error parsing %s" % repr(orig))
def test_roundtrips(self): def test_roundtrips(self):
testcases = [ testcases = [
('file:///tmp/junk.txt', ('file:///tmp/junk.txt',
......
...@@ -5,7 +5,7 @@ UC Irvine, June 1995. ...@@ -5,7 +5,7 @@ UC Irvine, June 1995.
""" """
__all__ = ["urlparse", "urlunparse", "urljoin", "urldefrag", __all__ = ["urlparse", "urlunparse", "urljoin", "urldefrag",
"urlsplit", "urlunsplit"] "urlsplit", "urlunsplit", "parse_qs", "parse_qsl"]
# A classification of schemes ('' means apply by default) # A classification of schemes ('' means apply by default)
uses_relative = ['ftp', 'http', 'gopher', 'nntp', 'imap', uses_relative = ['ftp', 'http', 'gopher', 'nntp', 'imap',
...@@ -267,6 +267,92 @@ def urldefrag(url): ...@@ -267,6 +267,92 @@ def urldefrag(url):
else: else:
return url, '' return url, ''
# unquote method for parse_qs and parse_qsl
# Cannot use it directly from urllib as that would create a circular import:
# urllib uses urlparse methods (urljoin).
_hextochr = dict(('%02x' % i, chr(i)) for i in range(256))
_hextochr.update(('%02X' % i, chr(i)) for i in range(256))
def unquote(s):
"""unquote('abc%20def') -> 'abc def'."""
res = s.split('%')
for i in xrange(1, len(res)):
item = res[i]
try:
res[i] = _hextochr[item[:2]] + item[2:]
except KeyError:
res[i] = '%' + item
except UnicodeDecodeError:
res[i] = unichr(int(item[:2], 16)) + item[2:]
return "".join(res)
def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
"""Parse a query given as a string argument.
Arguments:
qs: URL-encoded query string to be parsed
keep_blank_values: flag indicating whether blank values in
URL encoded queries should be treated as blank strings.
A true value indicates that blanks should be retained as
blank strings. The default false value indicates that
blank values are to be ignored and treated as if they were
not included.
strict_parsing: flag indicating what to do with parsing errors.
If false (the default), errors are silently ignored.
If true, errors raise a ValueError exception.
"""
dict = {}
for name, value in parse_qsl(qs, keep_blank_values, strict_parsing):
if name in dict:
dict[name].append(value)
else:
dict[name] = [value]
return dict
def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
"""Parse a query given as a string argument.
Arguments:
qs: URL-encoded query string to be parsed
keep_blank_values: flag indicating whether blank values in
URL encoded queries should be treated as blank strings. A
true value indicates that blanks should be retained as blank
strings. The default false value indicates that blank values
are to be ignored and treated as if they were not included.
strict_parsing: flag indicating what to do with parsing errors. If
false (the default), errors are silently ignored. If true,
errors raise a ValueError exception.
Returns a list, as G-d intended.
"""
pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
r = []
for name_value in pairs:
if not name_value and not strict_parsing:
continue
nv = name_value.split('=', 1)
if len(nv) != 2:
if strict_parsing:
raise ValueError, "bad query field: %r" % (name_value,)
# Handle case of a control-name with no equal sign
if keep_blank_values:
nv.append('')
else:
continue
if len(nv[1]) or keep_blank_values:
name = unquote(nv[0].replace('+', ' '))
value = unquote(nv[1].replace('+', ' '))
r.append((name, value))
return r
test_input = """ test_input = """
http://a/b/c/d http://a/b/c/d
......
...@@ -56,6 +56,10 @@ C-API ...@@ -56,6 +56,10 @@ C-API
Library Library
------- -------
- Issue 600362: Relocated parse_qs() and parse_qsl() from the cgi module
to the urlparse one. Added a PendingDeprecationWarning in the old
module; it will be deprecated in the future.
- Issue #2562: Fix distutils PKG-INFO writing logic to allow having - Issue #2562: Fix distutils PKG-INFO writing logic to allow having
non-ascii characters and Unicode in setup.py meta-data. non-ascii characters and Unicode in setup.py meta-data.
......