Python Forum

Strange behavior of parse_qsl when parameter value is '+'
I am seeing unexpected behavior from the parse_qsl function when the query string parameter value is just '+'.

I am on Python 3.11 and use parse_qsl to parse query strings from URLs. Ordinary parameter values parse as expected, but a value of '+' gives a result I did not expect.

Here are some examples of what it returns for different queries:
from urllib.parse import parse_qsl

parse_qsl('username=')  # []
parse_qsl('username= ')  # [('username', ' ')]
parse_qsl('username=1+2')  # [('username', '1 2')]
parse_qsl('username=123')  # [('username', '123')] 
parse_qsl('username=123&username=234')  # [('username', '123'), ('username', '234')]
However, when the value is just '+', it returns a list containing one tuple whose value is a single space:
parse_qsl('username=+')  # [('username', ' ')]
This seems strange to me, as '+' on its own does not represent a space. I would expect it to return [('username', '+')] or perhaps not parse it at all.

I have searched the Python and urllib documentation but did not find any mention of this specific behavior. Can anyone explain why parse_qsl returns a space in this case, and whether this is expected behavior that is defined somewhere? Or is it a bug in the implementation?

Any insight would be appreciated. Please let me know if you need any other context or details from me. I want to understand this parsing behavior better.
The end of the parse_qsl function calls replace('+', ' ') on both the name and the value before unquoting them:
def parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
              encoding='utf-8', errors='replace', max_num_fields=None, separator='&'):
    ...
    ...
    for name_value in query_args:
        ...
        ...
        if len(nv[1]) or keep_blank_values:
            name = nv[0].replace('+', ' ')
            name = unquote(name, encoding=encoding, errors=errors)
            name = _coerce_result(name)
            value = nv[1].replace('+', ' ')
            value = unquote(value, encoding=encoding, errors=errors)
            value = _coerce_result(value)
            r.append((name, value))
    return r
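To illustrate the replace step in the excerpt above, here is a small sketch using only standard urllib.parse functions. It assumes the query string follows the usual form encoding (application/x-www-form-urlencoded), in which '+' stands for a space and a literal plus sign has to be sent as %2B.

```python
from urllib.parse import parse_qsl, quote_plus, unquote_plus, urlencode

# '+' is decoded to a space wherever it appears in a name or value.
print(parse_qsl('username=+'))     # [('username', ' ')]
print(parse_qsl('username=1+2'))   # [('username', '1 2')]

# A literal plus sign survives only when it is percent-encoded.
print(parse_qsl('username=%2B'))   # [('username', '+')]

# The encoding side mirrors this: space becomes '+', plus becomes '%2B'.
print(quote_plus(' '))                   # +
print(quote_plus('+'))                   # %2B
print(urlencode({'username': '+'}))      # username=%2B
print(unquote_plus('+') == ' ')          # True
```

So if the client really meant a plus sign, the query string should have been built with urlencode or quote_plus rather than by hand.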
(Dec-28-2023, 11:10 AM)Yoriz Wrote: The end of the parse_qsl function is using replace('+', ' ')

Thank you for your answer. That explains why parse_qsl replaces '+' with ' '. What I am more curious about is whether a seemingly meaningless query parameter like this should be ignored rather than processed, given that parse_qsl drops a parameter such as username= that has no value at all.
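A rough sketch of the difference, based on the excerpt quoted earlier in the thread: the check `len(nv[1]) or keep_blank_values` looks at the raw value before '+' is replaced, so an empty value is dropped by default while '+' (length 1) is kept and then decoded to a space. The helper parse_nonblank below is hypothetical, not part of urllib, and only shows one way to filter whitespace-only values if that is what you want.

```python
from urllib.parse import parse_qsl

# An empty raw value is skipped unless keep_blank_values is set.
print(parse_qsl('username='))                          # []
print(parse_qsl('username=', keep_blank_values=True))  # [('username', '')]

# '+' and '%20' are non-empty raw values, so both are kept and decode to a space.
print(parse_qsl('username=+'))     # [('username', ' ')]
print(parse_qsl('username=%20'))   # [('username', ' ')]


def parse_nonblank(qs):
    """Hypothetical helper: drop pairs whose decoded value is only whitespace."""
    return [(name, value) for name, value in parse_qsl(qs) if value.strip()]


print(parse_nonblank('username=+&id=7'))   # [('id', '7')]
```

Whether a lone space counts as "meaningless" depends on the application, which is presumably why the standard library leaves that decision to the caller.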