I have a file, and I have figured out how to extract and strip the line-by-line data I need from it, but I end up with numbers representing individual colors, like this:
White (2), 150, 151, 310, 319, 333, 340, 341, 600, 602, 603, 640, 645, 648, 680, 745, 754, 758, 779, 819, 829, 838, 948, 991, 3042, 3072, 3743, 3778, 3812, 3821
I need to add up how many colors there are, and I am having trouble with the entries that have an (x) part. Each entry counts as 1 of a color unless it has an (x), in which case it counts as x instead of 1.
I have been programming since 1976 in BASIC, QBasic, Visual Basic, MASM, and Perl, but I have been on hiatus since 2002, so I am new to Python.
Any help is appreciated.
Thank you,
hwolff1962
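A minimal sketch of the counting rule described above, assuming the color list is already a single comma-separated string. The function name count_colors and the shortened sample string are illustrative only, not taken from the original file:

```python
import re

def count_colors(entries: str) -> int:
    """Count colors in a comma-separated list; an entry with "(x)" counts as x."""
    total = 0
    for entry in entries.split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty pieces, e.g. from a trailing comma
        # Look for a parenthesised count such as "(2)" anywhere in the entry
        match = re.search(r"\((\d+)\)", entry)
        total += int(match.group(1)) if match else 1
    return total

sample = "White (2), 150, 151, 310"
print(count_colors(sample))  # 2 + 1 + 1 + 1 = 5
```

Capturing the digits with a regular expression avoids slicing the string by fixed positions, so it should work whether or not there is a space before the bracket.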
Well poop, I found a simple solution:
new_soup6 = re.sub('[^A-Za-z0-9]+', '', new_soup5)

(Mar-22-2023, 02:42 AM) Hwolff1962 wrote: White (2), 150, 151, 310, 319, 333, 340, 341, 600...
I understand that you want to transform this data into something else, but I'm not sure of the exact result that you want. Can you describe this?
The data lines read like this: "Floss Names included: White(2),310,320,321,322(4),500,900"
This is what I ended up with. Note: I just started learning Python 2 days ago, so it is not very elegant:
############################################################
import re
from io import StringIO
from html.parser import HTMLParser

class MLStripper(HTMLParser):
    # COLLECTS ONLY THE TEXT BETWEEN HTML TAGS
    def __init__(self):
        super().__init__()
        self.reset()
        self.strict = False
        self.convert_charrefs = True
        self.text = StringIO()

    def handle_data(self, d):
        self.text.write(d)

    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()

def exceptions(ex_str):
    x = ex_str.find('(')
    if x != -1:
        # FOUND ( SO STRIP THE TRAILING )
        ex_str = ex_str.replace(")", "")
        # CONVERT EVERYTHING AFTER THE ( TO INT AND RETURN IT
        return int(ex_str[x+1:])
    # NO (x) PART, SO THE ENTRY COUNTS AS 1
    return 1
############################ MY STUFF
url = 'C:/Users/hwolf/Documents/Python/mira packs.txt'
with open(url, 'r') as file1:
    while True:
        # FIRST LINE HAS NAME INFO IN IT
        soup1 = file1.readline()
        # SECOND LINE HAS THE FLOSS NUMBERS IN IT
        soup2 = file1.readline()
        # readline() RETURNS '' AT END OF FILE, SO STOP THERE
        if not soup1 or not soup2:
            break
        m_name = soup1.split(',')
        new_soup1 = strip_tags(soup2)
        # STRIP LEFT TEXT
        new_soup2 = new_soup1[new_soup1.rfind("included:")+10:]
        # STRIP SPACES
        new_soup3 = new_soup2.replace(" ", "")
        # STRIP TRAILING " AND NEWLINE
        new_soup3 = new_soup3[:-2]
        colors = 0
        # SPLIT LIST INTO INDIVIDUAL ENTRIES
        for l in new_soup3.split(','):
            # HANDLE (x) IF FOUND
            if "(" in l:
                # ENTRY CONTAINS A BRACKET SO HANDLE THAT
                colors = colors + exceptions(l)
            else:
                colors = colors + 1
        # PRINT THE PACK NAME AND ITS TOTAL
        print(m_name[0], colors)
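
For comparison, here is a more compact sketch of the same two-lines-per-pack loop. It assumes the HTML tags have already been stripped and that each pack is a name line followed by a "Floss Names included:" line. The function name pack_totals and the sample pack name are hypothetical, made up for illustration:

```python
import io
import re

def pack_totals(lines):
    """Yield (name, total) for each pair of name line and floss line."""
    it = iter(lines)
    # zip(it, it) takes two consecutive lines per step and stops at end of input
    for name_line, floss_line in zip(it, it):
        name = name_line.split(",")[0].strip()
        # Keep only the text that follows "included:"
        _, _, floss = floss_line.partition("included:")
        # Drop surrounding whitespace and a trailing quote, then split on commas
        entries = [e.strip() for e in floss.strip().strip('"').split(",") if e.strip()]
        total = 0
        for entry in entries:
            # An entry like "322(4)" counts as 4; anything else counts as 1
            match = re.search(r"\((\d+)\)", entry)
            total += int(match.group(1)) if match else 1
        yield name, total

# Hypothetical two-line sample standing in for the real file
sample = io.StringIO(
    'Sample Pack,extra info\n'
    '"Floss Names included: White(2),310,320,321,322(4),500,900"\n'
)
for name, total in pack_totals(sample):
    print(name, total)  # Sample Pack 11
```

Because a file object opened with open() also yields lines when iterated, the same function could be passed the open file directly instead of the StringIO sample.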