Python Forum
extracting x from '(x)' as an integer number
#1
I have a file, and I have figured out how to extract and strip the data I need from it line by line, but I end up with the numbers representing individual colors like this:

White (2), 150, 151, 310, 319, 333, 340, 341, 600, 602, 603, 640, 645, 648, 680, 745, 754, 758, 779, 819, 829, 838, 948, 991, 3042, 3072, 3743, 3778, 3812, 3821

I need to add up how many colors there are in total, and I am having trouble with the entries that have an (x) part. Each entry counts as 1 of a color, unless it is followed by (x), in which case it counts as x instead of 1.
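The counting rule described above can be sketched as follows. This is only a minimal sketch: the function name `count_colors` is hypothetical, and it assumes the entries are separated by commas and that the multiplier, when present, is a whole number in parentheses.

```python
import re

def count_colors(line):
    """Count colors in a comma-separated list.

    An entry followed by "(x)" counts as x; any other entry counts as 1.
    """
    total = 0
    for item in line.split(','):
        # Look for digits wrapped in parentheses, e.g. "(2)"
        match = re.search(r'\((\d+)\)', item)
        total += int(match.group(1)) if match else 1
    return total

sample = "White (2), 150, 151, 310"
print(count_colors(sample))  # White counts as 2, the other three as 1 each: 5
```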

I have been programming since 1976 in BASIC, QBasic, Visual Basic, MASM and Perl, but I have been on hiatus since 2002, so I am new to Python.

Any help is appreciated.

Thank you,
hwolff1962
#2
Well, poop, I found a simple solution:

new_soup6 = re.sub('[^A-Za-z0-9]+', '', new_soup5)
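To see what that substitution does, here is a small sketch applied to single entries. The thread does not show what `new_soup5` contains, so the sample strings below are assumptions based on the data format quoted elsewhere in the thread.

```python
import re

# Sample entries; assumed, since new_soup5 is not shown in the thread
entry = "White(2)"
cleaned = re.sub('[^A-Za-z0-9]+', '', entry)
print(cleaned)  # White2

# Every run of characters other than letters and digits is removed,
# so the parentheses disappear and the digits are joined together
print(re.sub('[^A-Za-z0-9]+', '', "322(4)"))  # 3224
```

Note that for an entry such as 322(4) the result is 3224, so the count can no longer be told apart from the color number unless it is extracted separately.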
#3
(Mar-22-2023, 02:42 AM)Hwolff1962 Wrote: White (2), 150, 151, 310, 319, 333, 340, 341, 600...
I understand that you want to transform this data into something else, but I'm not sure of the exact result that you want. Can you describe this?
#4
The data lines I am reading look like "Floss Names included: White(2),310,320,321,322(4),500,900"

This is what I ended up with. Note: I started learning Python two days ago, so it is not very elegant:
############################################################
import re
from io import StringIO
from html.parser import HTMLParser

class MLStripper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.reset()
        self.strict = False
        self.convert_charrefs= True
        self.text = StringIO()
    def handle_data(self, d):
        self.text.write(d)
    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()

def exceptions(ex_str):
    # FIND THE NUMBER INSIDE ( ) AND RETURN IT AS AN INT
    m = re.search(r'\((\d+)\)', ex_str)
    # NO (x) FOUND MEANS THE ENTRY COUNTS AS 1
    return int(m.group(1)) if m else 1

############################  MY STUFF

url='C:/Users/hwolf/Documents/Python/mira packs.txt'

with open(url,'r') as file1:
    while True:
        # FIRST LINE HAS NAME INFO IN IT
        soup1 = file1.readline()
        # SECOND LINE HAS THE FLOSS NUMBERS IN IT
        soup2 = file1.readline()
        # readline() RETURNS '' AT END OF FILE, SO STOP THERE
        if not soup2:
            break
 
        m_name = soup1.split(',')

        new_soup1 = strip_tags(soup2)
        #STRIP LEFT TEXT
        new_soup2 = new_soup1[new_soup1.rfind("included:")+10:]
        # STRIP SPACES
        new_soup3 = new_soup2.replace(" ", "")
        # STRIP TRAILING NEWLINE AND "
        new_soup3 = new_soup3.rstrip().rstrip('"')

        colors = 0
        # SPLIT LIST INTO INDIVIDUAL ITEMS
        for l in new_soup3.split(','):
            # HANDLE (x) IF FOUND
            if "(" in l:
                # ITEM CONTAINS A BRACKET SO ADD THE NUMBER INSIDE IT
                colors = colors + exceptions(l)
            else:
                colors = colors + 1
            
        print(m_name[0],colors)
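
For comparison, the two-lines-at-a-time reading pattern can also be sketched with `zip` over a single iterator, which stops on its own at the end of the input. This is a hedged sketch, not a drop-in replacement: the function name `read_packs` and the sample data are hypothetical, the HTML tag stripping step is left out, and it assumes each record is exactly a name line followed by a floss line.

```python
import io
import re

def read_packs(lines):
    """Yield (name, color_count) for each name line / floss line pair."""
    it = iter(lines)
    # zip(it, it) pulls two consecutive lines per step from the same iterator
    for name_line, floss_line in zip(it, it):
        name = name_line.split(',')[0].strip()
        # Keep only the text after "included:"
        _, _, floss = floss_line.partition("included:")
        count = 0
        for item in floss.split(','):
            m = re.search(r'\((\d+)\)', item)
            # "(x)" counts as x, anything else as 1
            count += int(m.group(1)) if m else 1
        yield name, count

# Hypothetical two-line sample standing in for the real file
sample = io.StringIO(
    'Pack A, extra info\n'
    '"Floss Names included: White(2),310,320,321,322(4),500,900"\n'
)
print(list(read_packs(sample)))  # [('Pack A', 11)]
```

Because an open file is itself an iterator over its lines, the same function could be given the file object directly.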
Gribouillis wrote Mar-23-2023, 04:37 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.