Python Forum
Refactoring code for a domain name list
#1
Hello,

I've been working on this code today. Basically, I have a list of spam domains that I'd like to clean up and format so I can enter them in a list of BlockedDomains (PHP). I found a list of bad domains on StackOverflow, copied it, and put it in a txt file.

The code below works, but I'd like to know if there's a better way of implementing this with the re library.


import re

blockedDomains = ("""
    •	tenorshare.com
    •	advancedpdfconverter.com
    •	androiddatarecoverypro.com
    •	any-data-recovery.com
    •	card-data-recovery.com
    """)

stripped = re.sub(r"\s+", "", blockedDomains)
# Drop only the leading bullet; slicing with [1:-1] would also cut the final "m".
spliced = stripped[1:]
pretty = re.sub(r"•", "',\n'", spliced)

print("'" + pretty + "'")

And this returns a nicely formatted list:
---
'tenorshare.com',
'advancedpdfconverter.com',
'androiddatarecoverypro.com',
'any-data-recovery.com',
'card-data-recovery.com' // no comma
---
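
One possible way to do this in a single pass with re, sketched under the assumption that every entry follows a bullet and contains no whitespace: re.findall can pull the domains out directly, and str.join adds the quotes and commas. The names blocked_domains_text and format_domains are illustrative and not from the original post.

```python
import re

# Sample input in the same shape as the post: a bullet, a tab, then the domain.
blocked_domains_text = """
    •\ttenorshare.com
    •\tadvancedpdfconverter.com
    •\tcard-data-recovery.com
    """


def format_domains(text):
    """Return the bulleted domains as quoted, comma-separated lines."""
    # Capture the run of non-whitespace characters that follows each bullet.
    domains = re.findall(r"•\s*(\S+)", text)
    # join() puts ",\n" only between items, so the last line gets no comma.
    return ",\n".join(f"'{d}'" for d in domains)


print(format_domains(blocked_domains_text))
```

Because the separator is placed only between items, there is no need to slice off a leading bullet or patch the first and last quotes by hand.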
#2
blockedDomains = [
    'tenorshare.com',
    'advancedpdfconverter.com',
    'androiddatarecoverypro.com',
    'any-data-recovery.com',
    'card-data-recovery.com'
]

for domain in blockedDomains:
    print(f"'{domain}'")
#3
Hi @Larz60

Sorry I wasn't clearer.

The desired end result is the pretty list. The thing I came up against was prepending or appending to the string blockedDomains. I know strings are immutable, so I tried using an empty list, but that created a bunch of "".

So that's why there are a few steps with re.sub().
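
The empty strings most likely come from the split itself: splitting on the bullet returns a whitespace-only piece for the text before the first bullet. A minimal sketch of the list-based route, assuming the same bulleted input (the names raw, pieces and domains are hypothetical), strips each piece and drops the empty ones before joining:

```python
import re

# Two entries in the same shape as the original list.
raw = "\n    •\ttenorshare.com\n    •\tany-data-recovery.com\n    "

# Splitting on the bullet leaves a whitespace-only piece before the first bullet.
pieces = re.split(r"•", raw)

# Strip surrounding whitespace and keep only the non-empty pieces.
domains = [p.strip() for p in pieces if p.strip()]

# Build the quoted output from the list; no string needs to be modified in place.
pretty = ",\n".join(f"'{d}'" for d in domains)
print(pretty)
```

Building a list first and joining at the end sidesteps string immutability, since each step produces a new value instead of editing blockedDomains.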