Python Forum
Regex Subdomain Validation & Input website manual


Regex Subdomain Validation & Input website manual - rtzki - Sep-16-2018

Hello,
I have some code like this:

import re
import urllib.request

GRUBER_URLINTEXT_PAT = re.compile("(https?://)([^:^/]*)(:\\d*)?(.*)?")

# Print the host part (second group) of every URL found on each line.
with urllib.request.urlopen("https://pastebin.com/raw/hvGXKp72") as response:
    for line in response.read().decode("utf-8").splitlines():
        print([mgroups[1] for mgroups in GRUBER_URLINTEXT_PAT.findall(line)])
This code extracts only the domain, e.g.
example.com
without HTTP, HTTPS and WWW.
Now I have a question: how can I validate a subdomain? For example, can
subdomain.example.com
be read by this code?
Also, how can I enter the website manually instead of reading it from
("https://pastebin.com/raw/hvGXKp72")
? For example, the program prompts
Please Input Your Website :
and then I type in the website manually.

Thank you in advance,
I hope somebody can help me.

Sorry for my bad English.


RE: Regex Subdomain Validation & Input website manual - ichabod801 - Sep-16-2018

Well, re.compile("(https?://)?([^:^/]*)(:\\d*)?(.*)?") makes the http(s):// prefix optional: it captures the prefix if it is there, and it also matches a bare example.com. It will match subdomain.example.com too, but the whole host name ends up in the second group. Is that what you wanted, or did you want the subdomain to be in a separate group?
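
To make the grouping concrete, here is a small sketch that runs the optional-scheme pattern against two sample inputs and prints what lands in the second group. The names URL_PAT and host_of are only for illustration and are not part of the original code.

```python
import re

# The pattern quoted above, with the scheme group made optional.
# URL_PAT and host_of are hypothetical names used only for this sketch.
URL_PAT = re.compile("(https?://)?([^:^/]*)(:\\d*)?(.*)?")


def host_of(url):
    """Return the text captured by the second group (the host part)."""
    return URL_PAT.match(url).group(2)


for url in ("example.com", "https://subdomain.example.com:8080/path"):
    print(url, "->", host_of(url))
```

Note that a leading www. is not stripped by this pattern; it stays inside the second group along with the rest of the host name.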

As for asking for user input, that's easy:

url = input('Please input your website: ')
match = GRUBER_URLINTEXT_PAT.match(url)
if match is None:
    print('Invalid url.')
else:
    print('That url is valid.')
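
One caveat: once the scheme is optional, every part of the pattern is optional, so match() succeeds on any string and the 'Invalid url.' branch is never reached. If you want a real check and the subdomain on its own, something along these lines might work. This is only a rough sketch: HOST_PAT and split_host are made-up names, and it assumes the registered domain is always the last two labels, which is wrong for suffixes like co.uk.

```python
import re

# Hypothetical stricter pattern (not from the original post): optional scheme,
# optional "www.", a dotted host name built from letters, digits and hyphens,
# then an optional port and path.
HOST_PAT = re.compile(
    r"^(?:https?://)?(?:www\.)?"
    r"((?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63})"
    r"(?::\d+)?(?:/.*)?$",
    re.IGNORECASE,
)


def split_host(url):
    """Return (subdomain, domain), or None if the input does not look like a host.

    Assumption: the registered domain is the last two labels of the host name.
    """
    match = HOST_PAT.match(url.strip())
    if match is None:
        return None
    labels = match.group(1).lower().split(".")
    return ".".join(labels[:-2]), ".".join(labels[-2:])
```

You could then call split_host(input('Please input your website: ')) and treat a None result as an invalid URL; an empty first element means there was no subdomain.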