Python Forum

Full Version: re.search Q
Pages: 1 2
Hi,

I'm trying to extract info from a web form (request), but I get the error shown below.
I know group() exists, so I cannot understand the error.
TIA

import re
request = 'POST /configure HTTP/1.1\r\nHost: 192.168.4.1\r\nOrigin: http://192.168.4.1\r\nContent-Type: application/x-www-form-urlencoded\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 15_1_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.1 Mobile/15E148 Safari/604.1\r\nReferer: http://192.168.4.1/\r\nContent-Length: 108\r\nAccept-Language: en-GB,en;q=0.9\r\n\r\nssid=my_ssid&password=my_pass&token=&ip=192.168.1.222&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8'
match = re.search("ssid=([^&]*)&password=(.*)&token=(.*)&ip=(.*)&sbnet=(.*)&gw=(.*)&dns=(.*)", request)
ip = match.group(4)
print('ip', ip)
Output:
Traceback (most recent call last):
  File "C:\SharedFiles\Python\practice\test.py", line 4, in <module>
    ip = match.group(4)
AttributeError: 'NoneType' object has no attribute 'group'
re.search() returns None when no match is found.
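A minimal sketch of guarding against that None before calling group(). The helper name first_group and the shortened body string are hypothetical, made up for illustration only:

```python
import re

# Shortened, hypothetical form body (not the full request from the post).
body = "ssid=my_ssid&ip=192.168.1.222&gw=192.168.1.1"


def first_group(pattern, text):
    """Return group 1 of the first match, or None when nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        # No match: calling match.group() here would raise AttributeError.
        return None
    return match.group(1)


found = first_group(r"ip=([^&]*)", body)     # field is present
missing = first_group(r"dns=([^&]*)", body)  # field is absent
print(found, missing)
```

Checking for None first turns the AttributeError into a case the code can handle explicitly.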
(Nov-30-2021, 11:19 AM)Gribouillis Wrote: [ -> ]re.search() returns None when no match is found.
I understand, but there is a match: ip is in the request string (ip=192.168.1.222).
I am not sure how you came up with this string (and I suspect there is a flaw in your approach), but

from urllib.parse import parse_qs
request = 'POST /configure HTTP/1.1\r\nHost: 192.168.4.1\r\nOrigin: http://192.168.4.1\r\nContent-Type: application/x-www-form-urlencoded\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 15_1_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.1 Mobile/15E148 Safari/604.1\r\nReferer: http://192.168.4.1/\r\nContent-Length: 108\r\nAccept-Language: en-GB,en;q=0.9\r\n\r\nssid=my_ssid&password=my_pass&token=&ip=192.168.1.222&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8'
print(parse_qs(request.splitlines()[-1]))
Output:
{'ssid': ['my_ssid'], 'password': ['my_pass'], 'ip': ['192.168.1.222'], 'gw': ['192.168.1.1'], 'sbnet': ['255.255.255.0'], 'dns': ['8.8.8.8']}
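Note that 'token' is missing from the output above because its value is empty. As a small sketch, assuming the empty field should be kept, parse_qs accepts keep_blank_values=True. The shortened body string here is an illustration, not the full request:

```python
from urllib.parse import parse_qs

# Shortened form body with an empty token field, for illustration.
body = "ssid=my_ssid&password=my_pass&token=&ip=192.168.1.222"

default = parse_qs(body)                              # drops blank values
with_blanks = parse_qs(body, keep_blank_values=True)  # keeps them as ''

print(default)
print(with_blanks)
```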
Thanks, but I'm still struggling with my code and have no choice but to use re.search().

Here's updated code: searching for each variable individually works, but searching for all of them in one pattern does not.

Obviously, I'm not using the wrong identifiers.
import re

ssid = ""
password = ""
token = ""
ip = ""
gw = ""
sbnet = ""
dns = ""

x = 'POST /configure HTTP/1.1\r\nHost: 192.168.4.1\r\nOrigin: http://192.168.4.1\r\nContent-Type: ' \
    'application/x-www-form-urlencoded\r\nAccept-Encoding: gzip, deflate\r\nConnection: ' \
    'keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,' \
    '*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 15_1_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, ' \
    'like Gecko) Version/15.1 Mobile/15E148 Safari/604.1\r\nReferer: http://192.168.4.1/\r\nContent-Length: ' \
    '111\r\nAccept-Language: en-GB,en;q=0.9\r\n\r\nssid=my_ssid&password=my_pass&token=abctttttttttt123456789 ' \
    '&ip=192.168.1.222&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8 '

match = re.search("ssid=([^&]*)", x)
print(match)
match = re.search("password=([^&]*)", x)
print(match)
match = re.search("token=([^&]*)", x)
print(match)
match = re.search("ip=([^&]*)", x)
print(match)
match = re.search("gw=([^&]*)", x)
print(match)
match = re.search("sbnet=([^&]*)", x)
print(match)
match = re.search("dns=([^&]*)", x)
print(match)
match = re.search("ssid=([^&]*)&password=(.*)&token=([^&].*)&ip=([^&].*)&sbnet=([^&].*)&gw=([^&].*)&dns=([^&].*)", x)
print(match)

# try:
#     ssid = match.group(1).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@").replace("+", " ")
#     password = match.group(2).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     token = match.group(3).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     ip = match.group(4).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     sbnet = match.group(5).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     gw = match.group(6).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     dns = match.group(7).decode("utf-8").replace("%3F", "?").replace("%21", "!").replace("%40", "@")
# except Exception:
#     ssid = match.group(1).replace("%3F", "?").replace("%21", "!").replace("%40", "@").replace("+", " ")
#     password = match.group(2).replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     token = match.group(3).replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     ip = match.group(4).replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     sbnet = match.group(5).replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     gw = match.group(6).replace("%3F", "?").replace("%21", "!").replace("%40", "@")
#     dns = match.group(7).replace("%3F", "?").replace("%21", "!").replace("%40", "@")

print(ssid, password, token, ip, gw, sbnet, dns)
The entire regex has to match. Your regex has the string "sbnet=(.*)&gw=(.*)", but that isn't in your request string. Instead the request string has them in the opposite order: "...gw=192.168.1.1&sbnet=255.255.255.0..."

Since the order matters, the regex fails to match.
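The ordering point can be checked on the form body alone. This is only a sketch: the body string is the last line of the request from the thread, and the two patterns are trimmed to the fields around the swap.

```python
import re

# Form body copied from the request in the thread (gw comes before sbnet).
body = ("ssid=my_ssid&password=my_pass&token=&ip=192.168.1.222"
        "&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8")

# sbnet before gw: that sequence never occurs in the body, so no match.
wrong_order = re.search(r"ip=(.*)&sbnet=(.*)&gw=(.*)&dns=(.*)", body)

# gw before sbnet: same order as the body, so the pattern matches.
right_order = re.search(r"ip=(.*)&gw=(.*)&sbnet=(.*)&dns=(.*)", body)

print(wrong_order)
print(right_order.groups())
```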
(Nov-30-2021, 04:37 PM)bowlofred Wrote: [ -> ]The entire regex has to match. Your regex has the string "sbnet=(.*)&gw=(.*)", but that isn't in your request string. Instead the request string has them in the opposite order: "...gw=192.168.1.1&sbnet=255.255.255.0..."

Since the order matters, the regex fails to match.
Thanks very much, I didn't know a string search had to follow an order.

Anyhow, that fixed the 'None' issue but limits the search to x characters. Why is that?
match = re.search("ssid=([^]*)&password=(.*)&token=([^&].*)&ip=(.*)&gw=(.*)&sbnet=(.*)&dns=(.*)", x)
print(match)
(Nov-30-2021, 04:43 PM)ebolisa Wrote: [ -> ]Anyhow, that fixed the 'None' issue but limits the search to x characters. Why is that?

What do you mean by that?

import re
request = 'POST /configure HTTP/1.1\r\nHost: 192.168.4.1\r\nOrigin: http://192.168.4.1\r\nContent-Type: application/x-www-form-urlencoded\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 15_1_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.1 Mobile/15E148 Safari/604.1\r\nReferer: http://192.168.4.1/\r\nContent-Length: 108\r\nAccept-Language: en-GB,en;q=0.9\r\n\r\nssid=my_ssid&password=my_pass&token=&ip=192.168.1.222&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8'
match = re.search("ssid=([^]*)&password=(.*)&token=([^&].*)&ip=(.*)&gw=(.*)&sbnet=(.*)&dns=(.*)", request)
ip = match.group(4)
print('ip', ip)
Output:
ip 255.255.255.0
(Nov-30-2021, 05:33 PM)bowlofred Wrote: [ -> ]
import re
request = 'POST /configure HTTP/1.1\r\nHost: 192.168.4.1\r\nOrigin: http://192.168.4.1\r\nContent-Type: application/x-www-form-urlencoded\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\nUpgrade-Insecure-Requests: 1\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 15_1_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.1 Mobile/15E148 Safari/604.1\r\nReferer: http://192.168.4.1/\r\nContent-Length: 108\r\nAccept-Language: en-GB,en;q=0.9\r\n\r\nssid=my_ssid&password=my_pass&token=&ip=192.168.1.222&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8'
match = re.search("ssid=([^]*)&password=(.*)&token=([^&].*)&ip=(.*)&gw=(.*)&sbnet=(.*)&dns=(.*)", request)
ip = match.group(4)
print('ip', ip)

Thanks, but the groups are still not in order.
print(match.group(1)) # ssid password and token are not correct
print(match.group(2)) # ip is correct
print(match.group(3)) # gw is correct
print(match.group(4)) # sbnet is correct
print(match.group(5)) # dns is correct
Output:
my_ssid&password=my_pass&token=
192.168.1.222
192.168.1.1
255.255.255.0
8.8.8.8
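A possible explanation for the shifted groups, offered as a sketch rather than as anyone's diagnosis from the thread: the pattern starts with ([^]*) instead of ([^&]*). In Python's re module a ] directly after [^ is taken literally, so the character class keeps going until the next ], which is the one inside ([^&]. Everything from ssid= up to token= then falls into group 1, and the pattern has five groups instead of seven. The "fixed" pattern below is an assumption about what was intended.

```python
import re

# Form body copied from the request in the thread.
body = ("ssid=my_ssid&password=my_pass&token=&ip=192.168.1.222"
        "&gw=192.168.1.1&sbnet=255.255.255.0&dns=8.8.8.8")

# Pattern as posted, with the [^] typo.
typo = re.compile(
    r"ssid=([^]*)&password=(.*)&token=([^&].*)&ip=(.*)&gw=(.*)&sbnet=(.*)&dns=(.*)"
)

# Assumed intent: one [^&]* group per field, in the order used by the body.
fixed = re.compile(
    r"ssid=([^&]*)&password=([^&]*)&token=([^&]*)&ip=([^&]*)"
    r"&gw=([^&]*)&sbnet=([^&]*)&dns=([^&]*)"
)

print(typo.groups, fixed.groups)     # number of capture groups in each
print(typo.search(body).group(1))    # swallows ssid, password and token
print(fixed.search(body).groups())   # one value per field
```

Using [^&]* rather than .* for every field also stops a group from running past the next & separator.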
I suggest this ersatz
>>> {m.group(1): m.group(2) for m in re.finditer(r'([a-z]+)\=([^=&\n]*)', request)}
{'q': '0.9\r', 'ssid': 'my_ssid', 'password': 'my_pass', 'token': '', 'ip': '192.168.1.222', 'gw': '192.168.1.1', 'sbnet': '255.255.255.0', 'dns': '8.8.8.8'}
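The dictionary above also picks up a 'q' key from the Accept-Language header. One way to avoid that, sketched here under the assumption that the form fields always sit after the blank line that ends the headers, is to run the same finditer over the body only. The shortened request string is for illustration:

```python
import re

# Shortened request: one header that contains "q=" plus a short form body.
request = ("POST /configure HTTP/1.1\r\n"
           "Accept-Language: en-GB,en;q=0.9\r\n"
           "\r\n"
           "ssid=my_ssid&password=my_pass&token=&ip=192.168.1.222")

# Everything after the first blank line is the body.
headers, _, body = request.partition("\r\n\r\n")

fields = {m.group(1): m.group(2)
          for m in re.finditer(r"([a-z]+)=([^=&\n]*)", body)}
print(fields)
```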