Mar-20-2020, 07:35 PM
Hello to all,
Maybe someone could help me with this:
I have this file, whose values I want to tabulate. Keys a to c begin a new sequence of values (a block), and these 3 keys are always present. After keys a, b and c, keys d to g may follow.

    SOME TEXT SOME TEXT SOME TEXT SOME TEXT SOME TEXT SOME TEXT SOME TEXT SOME TEXT
     a = 1
     b = 5
     c = 3
     d = 0
     e = 0
     d = 4
     e = 1
     g = 1
    blah blah blah blah
    /// FINISH
     a = 3
     b = 2
     c = 8
     d = 6
     e = 9
     f = 3
    blah blah blah blah
    /// FINISH
     a = 7
     b = 2
     c = 2
     d = 9
     e = 0
     d = 1
     e = 4
     d = 7
     e = 0
     f = 1
     d = 1
     g = 8
    blah blah blah blah
    /// FINISH

My goal is to tabulate it like the image below, using the list structure Pandas needs:
![[Image: table.jpg?raw=1]](https://www.dropbox.com/s/iyqxwx7wdejufmi/table.jpg?raw=1)
I'm currently able to store the file content in a list (lst) and then I try to group-by that list, getting this output(m2):

    import re, pprint
    from collections import defaultdict

    file = 'file.txt'
    f = open(file, "r").read().splitlines()
    lst = []

    # keep only the indented "key = value" lines
    for line in f:
        if re.match(r'[ \t]', line):
            lst.append(line.replace(' ', '').split('='))

    print(lst)

    # group all values by key
    m2 = defaultdict(list)
    for k, v in lst:
        m2[k].append(v)

    >>> pprint.pprint(m2)
    defaultdict(<class 'list'>,
                {'a': [1, 3, 7],
                 'b': [5, 2, 2],
                 'c': [3, 8, 2],
                 'd': [0, 4, 6, 9, 1, 7, 1],
                 'e': [0, 1, 9, 0, 4, 0],
                 'f': [3, 1],
                 'g': [1, 8]})

My issue is that the correct input (m2) to feed a Pandas DataFrame would be like this:

    m2 = {
        'a': [1,1,3,7,7,7,7],
        'b': [5,5,2,2,2,2,2],
        'c': [3,3,8,2,2,2,2],
        'd': [0,4,6,9,1,7,1],
        'e': [0,1,9,0,4,0,''],
        'f': ['','',3,'','',1,''],
        'g': [1,'','','','','',8],
    }

That needs a kind of fill-down (only for keys a, b, c) and a fill with blanks (for keys d to g) where needed.
Already asked on SO but no answers.
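For reference, here is a minimal sketch of one way the fill-down and blank-fill could be done while parsing, instead of grouping afterwards. It rests on assumptions that are not confirmed anywhere in the post: every `d` key starts a new row, key `a` starts a new block, all values are integers, and every block has at least one of the keys d to g. The names `tabulate`, `KEYS`, `HEADER_KEYS` and `PAIR` are made up for illustration.

```python
import re

KEYS = ["a", "b", "c", "d", "e", "f", "g"]
HEADER_KEYS = ("a", "b", "c")  # block keys, repeated on every row of the block
# Matches "key = value" lines; free-text lines without "=" are skipped.
PAIR = re.compile(r"^\s*(\w+)\s*=\s*(-?\d+)\s*$")


def tabulate(lines):
    """Return a dict of equal-length columns built from key = value lines."""
    rows = []      # one dict per output row
    header = {}    # current block's a, b, c values
    row = None     # row currently being filled
    for line in lines:
        m = PAIR.match(line)
        if not m:
            continue
        key, value = m.group(1), int(m.group(2))
        if key in HEADER_KEYS:
            if key == "a":      # assumption: "a" opens a new block
                header = {}
            header[key] = value
            row = None
        else:
            # assumption: "d" opens a new row inside the block
            if key == "d" or row is None:
                row = dict(header)   # fill down a, b, c
                rows.append(row)
            row[key] = value
    # missing keys become blanks
    return {k: [r.get(k, "") for r in rows] for k in KEYS}


sample = """SOME TEXT
 a = 1
 b = 5
 c = 3
 d = 0
 e = 0
 d = 4
 e = 1
 g = 1
blah blah
/// FINISH
 a = 3
 b = 2
 c = 8
 d = 6
 e = 9
 f = 3
/// FINISH
 a = 7
 b = 2
 c = 2
 d = 9
 e = 0
 d = 1
 e = 4
 d = 7
 e = 0
 f = 1
 d = 1
 g = 8
/// FINISH""".splitlines()

table = tabulate(sample)
```

The resulting `table` is a dict of equal-length lists, so it can be passed straight to `pandas.DataFrame(table)`. Under the "d starts a new row" assumption, `g = 1` from the first block lands in the second row (next to `d = 4`), whereas the desired dictionary above lists it in the first row; which of the two is intended is not clear from the post.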