Python Forum
Data Hygiene, classes and magic reappearance of data
#1
Hello there,

I am parsing an XML file for some data and want to store it in a class.
The data comes as pairs of floats, and the class is supposed to hold a list of 2-tuples.


The class looks like this:
class TimeLabelData:
    labels = []
    def __init__(self,filename):
        self.name = filename

    def addLabelPair(self,start,stop):
        self.labels.append((float(start),float(stop)))

    def getData(self):
        return self.labels

    def getLength(self):
        if not self.labels: return 0
        return  max(max(self.labels,key=lambda x:x[1]))
    
    def getSilence(self):
        sound = 0
        for l in self.labels:
            sound+=l[1]-l[0]

        return self.getLength()-sound

    def flush(self):
        self.labels = []
        
Here is the function I use to go through the XML:

from xml.etree import ElementTree
import TimeLabelDataModule

def xml_processing(file):
    """Run through the XML (.aup file) to get the time labels and sum them."""
    e = ElementTree.parse(file).getroot()
    for child in e:
        if child.tag == "{http://audacity.sourceforge.net/xml/}labeltrack": #found timelabel section
            t = TimeLabelDataModule.TimeLabelData(file)
            for label in child:
                t.addLabelPair(label.attrib['t'],label.attrib['t1'])
				
                print(label.attrib['t'],label.attrib['t1'])
            print(t.getData())
            print(t.getLength())
            print(t.getSilence())
			
            t.flush()
            del t
You may notice the excessive attempts to get rid of the data in t (flush() and del). Those were desperate attempts to solve my problem, which I describe below.

Now an example XML file:
So far so good. If I run this on one file, there is no problem. The output looks like this:
Output:
C:\Users\1337\Desktop\marie\PostProjects\01-AudioTrack.aup
0.5712300000 1.0100000000
2.1700000000 2.5900000000
2.8800000000 3.5900000000
3.7000000000 3.7100000000
4.4900000000 4.5200000000
5.1400000000 5.7100000000
5.9200000000 7.2300000000
7.4700000000 8.3100000000
8.7700000000 12.8600000000
13.0400000000 13.0500000000
13.2800000000 13.2900000000
13.4600000000 13.5100000000
13.7300000000 14.5400000000
14.8500000000 17.6000000000
17.7500000000 17.7500000000
[(0.57123, 1.01), (2.17, 2.59), (2.88, 3.59), (3.7, 3.71), (4.49, 4.52), (5.14, 5.71), (5.92, 7.23), (7.47, 8.31), (8.77, 12.86), (13.04, 13.05), (13.28, 13.29), (13.46, 13.51), (13.73, 14.54), (14.85, 17.6), (17.75, 17.75)]
17.75
5.701229999999999
Everything's fine.
But if I loop over multiple XML files, time data from one instance of TimeLabelData somehow "spills" into the next one, even after I call flush() on the instance or del it.

Here is a 2nd example XML file:

Now on to the output:

Output:
C:\Users\1337\Desktop\marie\PostProjects\01-AudioTrack.aup
0.5712300000 1.0100000000
2.1700000000 2.5900000000
2.8800000000 3.5900000000
3.7000000000 3.7100000000
4.4900000000 4.5200000000
5.1400000000 5.7100000000
5.9200000000 7.2300000000
7.4700000000 8.3100000000
8.7700000000 12.8600000000
13.0400000000 13.0500000000
13.2800000000 13.2900000000
13.4600000000 13.5100000000
13.7300000000 14.5400000000
14.8500000000 17.6000000000
17.7500000000 17.7500000000
[(0.57123, 1.01), (2.17, 2.59), (2.88, 3.59), (3.7, 3.71), (4.49, 4.52), (5.14, 5.71), (5.92, 7.23), (7.47, 8.31), (8.77, 12.86), (13.04, 13.05), (13.28, 13.29), (13.46, 13.51), (13.73, 14.54), (14.85, 17.6), (17.75, 17.75)]
17.75
5.701229999999999
C:\Users\1337\Desktop\marie\PostProjects\02-AudioTrack.aup
0.0600000000 4.4300000000
4.7400000000 6.3100000000
6.9900000000 7.4000000000
7.7300000000 10.4700000000
10.9500000000 11.0100000000
11.3800000000 13.7600000000
14.3100000000 14.3800000000
14.7400000000 17.6700000000
17.6700000000 17.6700000000
[(0.57123, 1.01), (2.17, 2.59), (2.88, 3.59), (3.7, 3.71), (4.49, 4.52), (5.14, 5.71), (5.92, 7.23), (7.47, 8.31), (8.77, 12.86), (13.04, 13.05), (13.28, 13.29), (13.46, 13.51), (13.73, 14.54), (14.85, 17.6), (17.75, 17.75), (0.06, 4.43), (4.74, 6.31), (6.99, 7.4), (7.73, 10.47), (10.95, 11.01), (11.38, 13.76), (14.31, 14.38), (14.74, 17.67), (17.67, 17.67)]
17.75
-8.828770000000006
As you can see, parsing the 1st file works perfectly fine, and the 2nd almost as well. But the tuple (17.75, 17.75) appears in the 2nd t instance, although it is not in the 2nd XML file, and the output does not suggest that it was added to that instance.
It does exist in the 1st t instance / XML file, however.
So my assumption is that I somehow fail to clean up the t instance, and my debugger says the same thing. But why do "del t" and "t.flush()" not work the way I expected them to?
#2
labels is a class attribute. You are not deleting the class, you are deleting the instance, and the data is sticking around in the class attribute. Make the attribute an instance attribute initialized to [] in __init__.
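
A minimal sketch of that suggestion, not necessarily the exact code the original poster ended up with. `SharedLabels` is a hypothetical, stripped-down class that exists only to reproduce the effect; `TimeLabelData` keeps the names from the question but moves `labels` into `__init__`. It also suggests why `flush()` and `del t` did not help: `self.labels = []` only creates a new instance attribute that shadows the class-level list, and `del` only removes the name of the instance, so the class-level list keeps its contents either way.

```python
class SharedLabels:
    """Reproduces the problem: `labels` is one list owned by the class."""
    labels = []  # created once, when the class body is executed

    def add(self, start, stop):
        # The instance has no attribute named `labels` yet, so this lookup
        # falls through to the class and mutates the shared list.
        self.labels.append((float(start), float(stop)))

    def flush(self):
        # Rebinding creates a new *instance* attribute that shadows the
        # class attribute; the class-level list is left untouched.
        self.labels = []


class TimeLabelData:
    """Suggested fix: every instance gets its own list in __init__."""

    def __init__(self, filename):
        self.name = filename
        self.labels = []  # instance attribute, fresh for each object

    def addLabelPair(self, start, stop):
        self.labels.append((float(start), float(stop)))

    def getData(self):
        return self.labels


# Buggy version: data from the first instance shows up in the second.
a = SharedLabels()
a.add("17.75", "17.75")
a.flush()
del a  # removes the name `a`; SharedLabels.labels still holds the tuple
b = SharedLabels()
b.add("0.06", "4.43")
print(b.labels)  # [(17.75, 17.75), (0.06, 4.43)]

# Fixed version: each instance starts with an empty list.
first = TimeLabelData("01-AudioTrack.aup")
first.addLabelPair("17.75", "17.75")
second = TimeLabelData("02-AudioTrack.aup")
second.addLabelPair("0.06", "4.43")
print(second.getData())  # [(0.06, 4.43)]
```

With the list created in `__init__`, the `flush()` and `del t` calls in `xml_processing` should no longer be needed, because each new `TimeLabelData(file)` starts out empty.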
Craig "Ichabod" O'Brien - xenomind.com
#3
(Mar-02-2019, 04:02 PM)ichabod801 Wrote: labels is a class attribute. You are not deleting the class, you are deleting the instance, and the data is sticking around in the class attribute. Make the attribute an instance attribute initialized to [] in __init__.

Ty that solved it!