Python Forum

Full Version: extracting x from '(x)' as an integer number
I have a file, and I have figured out how to extract and strip the line-by-line data I need from it, but I end up with the numbers representing individual colors like this:

White (2), 150, 151, 310, 319, 333, 340, 341, 600, 602, 603, 640, 645, 648, 680, 745, 754, 758, 779, 819, 829, 838, 948, 991, 3042, 3072, 3743, 3778, 3812, 3821

I need to add up how many color numbers there are, and I am having trouble with the entries that have an (x) part. Each entry counts as 1 of a color, unless it has an (x), in which case it counts as x instead of 1.

I have been programming since 1976 in BASIC, QBasic, Visual Basic, MASM and Perl, but have been on hiatus since 2002, so I am new to Python.

Any help is appreciated.

Thank you,
hwolff1962
well poop, found a simple solution

new_soup6 = re.sub('[^A-Za-z0-9]+', '', new_soup5)
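
For anyone reading along, here is a small sketch of what that substitution does. The contents of new_soup5 are not shown in the thread, so the sample entries and the helper name strip_non_alnum below are assumptions made only for illustration.

```python
import re

def strip_non_alnum(text):
    # Same pattern as the line above: drop every run of characters
    # that is not an ASCII letter or digit.
    return re.sub(r"[^A-Za-z0-9]+", "", text)

# Hypothetical single entries, not taken from the original file.
for entry in ["White (2)", "322(4)", "310"]:
    print(repr(entry), "->", repr(strip_non_alnum(entry)))
```

Note that under these assumptions the brackets disappear but the digits stay, so an entry such as 322(4) becomes 3224 and the count is no longer separate from the color number.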
(Mar-22-2023, 02:42 AM) Hwolff1962 wrote: White (2), 150, 151, 310, 319, 333, 340, 341, 600...
I understand that you want to transform this data into something else, but I'm not sure of the exact result that you want. Can you describe this?
The data lines read like "Floss Names included: White(2),310,320,321,322(4),500,900"
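
As a rough sketch of the counting rule for a line in that format: split on commas, and let each entry count as 1 unless it ends in a bracketed number. The function name count_colors is made up for this example, and it assumes the bracketed part always holds digits only.

```python
import re

def count_colors(line):
    """Count floss colors in one data line; an entry like 'White(2)' counts as 2."""
    # Keep only the text after 'included:' if that label is present.
    _, _, entries = line.rpartition("included:")
    total = 0
    for entry in entries.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty pieces, e.g. from a trailing comma
        # Look for a bracketed number such as '(2)' at the end of the entry.
        match = re.search(r"\((\d+)\)\s*$", entry)
        total += int(match.group(1)) if match else 1
    return total

print(count_colors("Floss Names included: White(2),310,320,321,322(4),500,900"))  # prints 11
```

Here White(2) contributes 2, 322(4) contributes 4, and the other five entries contribute 1 each, giving 11.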

This is what I ended up with. Note that I just started learning Python 2 days ago, so it is not very elegant:
############################################################
import re
from io import StringIO
from html.parser import HTMLParser

class MLStripper(HTMLParser):
    def __init__(self):
        # HTMLParser.__init__ ALREADY CALLS reset(), SO NO EXTRA CALL IS NEEDED
        super().__init__(convert_charrefs=True)
        self.text = StringIO()
    def handle_data(self, d):
        self.text.write(d)
    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()

def exceptions(ex_str):
    # FIND THE ( AND THE MATCHING )
    start = ex_str.find('(')
    end = ex_str.find(')', start)
    if start != -1 and end != -1:
        # CONVERT THE TEXT BETWEEN THEM TO INT AND RETURN IT
        return int(ex_str[start+1:end])
    # NO (x) PRESENT, SO THE ENTRY COUNTS AS ONE COLOR
    return 1

############################  MY STUFF

url='C:/Users/hwolf/Documents/Python/mira packs.txt'

with open(url,'r') as file1:
    while True:
        # FIRST LINE HAS NAME INFO IN IT
        soup1 = file1.readline()
        # SECOND LINE HAS THE FLOSS NUMBERS IN IT
        soup2 = file1.readline()
        # readline() RETURNS '' AT END OF FILE, SO STOP THERE
        if not soup1 or not soup2:
            break
 
        m_name = soup1.split(',')

        new_soup1 = strip_tags(soup2)
        #STRIP LEFT TEXT
        new_soup2 = new_soup1[new_soup1.rfind("included:")+10:]
        # STRIP SPACES
        new_soup3 = new_soup2.replace(" ", "")
        # STRIP TRAILING " AND NEWLINE 
        new_soup3 = new_soup3[:-2]

        colors = 0
        # SPLIT LIST INTO INDIVIDUAL ENTRIES
        for l in new_soup3.split(','):
            # HANDLE (x) IF FOUND
            if '(' in l:
                # ENTRY CONTAINS A BRACKET, SO ADD THE COUNT INSIDE IT
                colors = colors + exceptions(l)
            else:
                colors = colors + 1
            
        print(m_name[0],colors)
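
One possible way to read the file two lines at a time without a manual end-of-file check is to zip the file iterator with itself. This is only a sketch: the names count_entry and pack_totals and the sample text are invented for illustration, and it skips the HTML stripping and trailing-quote handling from the script above.

```python
import io
import re

def count_entry(entry):
    # An entry like '322(4)' counts as 4; a plain entry counts as 1.
    match = re.search(r"\((\d+)\)", entry)
    return int(match.group(1)) if match else 1

def pack_totals(lines):
    """Yield (pack_name, color_count) for each name/floss pair of lines."""
    it = iter(lines)
    # zip(it, it) pulls two lines per step and stops at end of input.
    for name_line, floss_line in zip(it, it):
        name = name_line.split(",")[0].strip()
        _, _, entries = floss_line.rpartition("included:")
        parts = [e for e in entries.split(",") if e.strip()]
        yield name, sum(count_entry(e) for e in parts)

# Hypothetical sample standing in for the real file (assumed layout:
# a name line followed by a floss line, repeated).
sample = io.StringIO(
    "Pack A,some other info\n"
    "Floss Names included: White(2),310,320,321,322(4),500,900\n"
    "Pack B,more info\n"
    "Floss Names included: 150,151\n"
)
for name, total in pack_totals(sample):
    print(name, total)
```

With a real file, the open file object could be passed to pack_totals in place of the StringIO sample, since both can be iterated line by line.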