Python Forum

Data Hygiene, classes and magic reappearance of data
Hello there,

I am parsing an XML file for some data and want to store that data in a class.
The data comes as pairs of floats (start, stop), and the class is supposed to hold a list of these 2-tuples.


The class looks like this:
class TimeLabelData:
    labels = []
    def __init__(self,filename):
        self.name = filename

    def addLabelPair(self,start,stop):
        self.labels.append((float(start),float(stop)))

    def getData(self):
        return self.labels

    def getLength(self):
        if not self.labels: return 0
        return  max(max(self.labels,key=lambda x:x[1]))
    
    def getSilence(self):
        sound = 0
        for l in self.labels:
            sound+=l[1]-l[0]

        return self.getLength()-sound

    def flush(self):
        self.labels = []
        
Here is the function I use to go through the XML:

from xml.etree import ElementTree
import TimeLabelDataModule

def xml_processing(file):
    """Runs through the XML (.aup file) to get time labels and sum them"""
    e = ElementTree.parse(file).getroot()
    for child in e:
        if child.tag == "{http://audacity.sourceforge.net/xml/}labeltrack": #found timelabel section
            t = TimeLabelDataModule.TimeLabelData(file)
            for label in child:
                t.addLabelPair(label.attrib['t'],label.attrib['t1'])
				
                print(label.attrib['t'],label.attrib['t1'])
            print(t.getData())
            print(t.getLength())
            print(t.getSilence())
			
            t.flush()
            del t
You may notice the excessive attempts to get rid of the data in t (flush() and del). Those are desperate attempts to solve my problem, but more on that later.

Now an example XML file:
So far so good. If I run this on one file, there is no problem. The output looks like this:
Output:
C:\Users\1337\Desktop\marie\PostProjects\01-AudioTrack.aup
0.5712300000 1.0100000000
2.1700000000 2.5900000000
2.8800000000 3.5900000000
3.7000000000 3.7100000000
4.4900000000 4.5200000000
5.1400000000 5.7100000000
5.9200000000 7.2300000000
7.4700000000 8.3100000000
8.7700000000 12.8600000000
13.0400000000 13.0500000000
13.2800000000 13.2900000000
13.4600000000 13.5100000000
13.7300000000 14.5400000000
14.8500000000 17.6000000000
17.7500000000 17.7500000000
[(0.57123, 1.01), (2.17, 2.59), (2.88, 3.59), (3.7, 3.71), (4.49, 4.52), (5.14, 5.71), (5.92, 7.23), (7.47, 8.31), (8.77, 12.86), (13.04, 13.05), (13.28, 13.29), (13.46, 13.51), (13.73, 14.54), (14.85, 17.6), (17.75, 17.75)]
17.75
5.701229999999999
Everything's fine.
But if I loop over multiple XML files, time data from one instance of TimeLabelData somehow "spills" into another, even after I flush() or del the instance.

Here is a 2nd example XML file:

Now on to the output:

Output:
C:\Users\1337\Desktop\marie\PostProjects\01-AudioTrack.aup
0.5712300000 1.0100000000
2.1700000000 2.5900000000
2.8800000000 3.5900000000
3.7000000000 3.7100000000
4.4900000000 4.5200000000
5.1400000000 5.7100000000
5.9200000000 7.2300000000
7.4700000000 8.3100000000
8.7700000000 12.8600000000
13.0400000000 13.0500000000
13.2800000000 13.2900000000
13.4600000000 13.5100000000
13.7300000000 14.5400000000
14.8500000000 17.6000000000
17.7500000000 17.7500000000
[(0.57123, 1.01), (2.17, 2.59), (2.88, 3.59), (3.7, 3.71), (4.49, 4.52), (5.14, 5.71), (5.92, 7.23), (7.47, 8.31), (8.77, 12.86), (13.04, 13.05), (13.28, 13.29), (13.46, 13.51), (13.73, 14.54), (14.85, 17.6), (17.75, 17.75)]
17.75
5.701229999999999
C:\Users\1337\Desktop\marie\PostProjects\02-AudioTrack.aup
0.0600000000 4.4300000000
4.7400000000 6.3100000000
6.9900000000 7.4000000000
7.7300000000 10.4700000000
10.9500000000 11.0100000000
11.3800000000 13.7600000000
14.3100000000 14.3800000000
14.7400000000 17.6700000000
17.6700000000 17.6700000000
[(0.57123, 1.01), (2.17, 2.59), (2.88, 3.59), (3.7, 3.71), (4.49, 4.52), (5.14, 5.71), (5.92, 7.23), (7.47, 8.31), (8.77, 12.86), (13.04, 13.05), (13.28, 13.29), (13.46, 13.51), (13.73, 14.54), (14.85, 17.6), (17.75, 17.75), (0.06, 4.43), (4.74, 6.31), (6.99, 7.4), (7.73, 10.47), (10.95, 11.01), (11.38, 13.76), (14.31, 14.38), (14.74, 17.67), (17.67, 17.67)]
17.75
-8.828770000000006
As you can see, parsing the 1st file works perfectly fine, and the 2nd almost as well. But the tuple (17.75, 17.75) appears in the 2nd t instance even though it is not in the 2nd XML file, and the printed pairs do not suggest that it was added to that t instance.
It does exist in the 1st t instance / XML file, however.
So my assumption is that I somehow fail to clean up the t instance, and my debugger says the same thing. But why do "del t" and "t.flush()" not work the way I expected them to?
labels is a class attribute. You are not deleting the class, you are deleting the instance, and the data is sticking around in the class attribute. Make the attribute an instance attribute initialized to [] in __init__.
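
To illustrate the difference, here is a minimal sketch. The class names SharedLabels and OwnLabels and the method add are made up for this example and are not from the original post; only the placement of the labels list matters.

```python
class SharedLabels:
    labels = []  # class attribute: one list shared by every instance

    def add(self, start, stop):
        # self.labels resolves to the class-level list, so this mutates shared state
        self.labels.append((float(start), float(stop)))


class OwnLabels:
    def __init__(self):
        self.labels = []  # instance attribute: a fresh list per instance

    def add(self, start, stop):
        self.labels.append((float(start), float(stop)))


a = SharedLabels()
a.add("0.5", "1.0")
b = SharedLabels()
b.add("2.0", "3.0")
shared_result = b.labels  # also contains the pair added through a

c = OwnLabels()
c.add("0.5", "1.0")
d = OwnLabels()
d.add("2.0", "3.0")
own_result = d.labels  # contains only the pair added through d

print(shared_result)
print(own_result)
```

With the list created in __init__, each new instance starts empty, so no explicit cleanup between files should be needed.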
(Mar-02-2019, 04:02 PM) ichabod801 wrote: labels is a class attribute. You are not deleting the class, you are deleting the instance, and the data is sticking around in the class attribute. Make the attribute an instance attribute initialized to [] in __init__.

Ty that solved it!
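
For anyone finding this thread later, here is a small sketch of why flush() and del t appeared to do nothing. It uses a trimmed-down copy of the class from the first post with the class attribute left in place; the variable names after_flush_instance, after_flush_class and leftover are only for illustration.

```python
class TimeLabelData:
    labels = []  # class attribute, as in the original post

    def addLabelPair(self, start, stop):
        # No instance attribute exists yet, so this appends to the class-level list
        self.labels.append((float(start), float(stop)))

    def flush(self):
        # Assignment creates a new attribute on the instance that shadows the
        # class attribute; the class-level list itself is left untouched
        self.labels = []


t = TimeLabelData()
t.addLabelPair("17.75", "17.75")
t.flush()
after_flush_instance = t.labels           # the new, empty instance list
after_flush_class = TimeLabelData.labels  # still holds the appended pair
del t                                     # unbinds the name t; the class stays

t2 = TimeLabelData()
leftover = t2.labels  # a new instance sees the old class-level data

print(after_flush_instance, after_flush_class, leftover)
```

Emptying the shared list in place (for example with TimeLabelData.labels.clear()) would also remove the old data, but moving the list into __init__ as suggested above avoids the shared state altogether.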