How to correctly tabulate repeated blocks?

Xiesxes · Mar-20-2020, 07:35 PM

Hello to all,

Maybe someone could help me with this:

I have this file whose values I want to tabulate. Keys a to c begin a new sequence of values (a block), and these 3 keys are always present. After keys a, b and c, keys d to g may follow.

SOME TEXT SOME TEXT
SOME TEXT SOME TEXT
SOME TEXT SOME TEXT SOME TEXT
SOME TEXT

                         a = 1
                         b = 5
                         c = 3

                         d = 0
                         e = 0

                         d = 4
                         e = 1
                         g = 1


blah blah

blah blah

///   FINISH


                         a = 3
                         b = 2
                         c = 8
                         d = 6
                         e = 9
                         f = 3



blah blah

blah blah

///   FINISH


                         a = 7
                         b = 2
                         c = 2
                         d = 9
                         e = 0

                         d = 1
                         e = 4

                         d = 7
                         e = 0
                         f = 1

                         d = 1
                         g = 8


blah blah

blah blah

///   FINISH

My goal is to tabulate it like the image below, using the list structure that Pandas needs:
[Image: table.jpg?raw=1]

I'm currently able to store the file content in a list (lst). I then try to group that list by key, which gives this output (m2):

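import re, pprint
from collections import defaultdict

file = 'file.txt'
with open(file, "r") as fh:
    f = fh.read().splitlines()

# Keep only the indented "key = value" lines and split them into [key, value].
# The values in the sample file are integers, so they are converted with int().
lst = []
for line in f:
    if re.match(r'[ \t]', line) and '=' in line:
        key, value = line.replace(' ', '').split('=')
        lst.append([key, int(value)])

print(lst)


# Group all values by key; the block and row structure is lost here.
m2 = defaultdict(list)
for k, v in lst:
    m2[k].append(v)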

>>> pprint.pprint(m2)
defaultdict(<class 'list'>,
            {'a': [1, 3, 7],
             'b': [5, 2, 2],
             'c': [3, 8, 2],
             'd': [0, 4, 6, 9, 1, 7, 1],
             'e': [0, 1, 9, 0, 4, 0],
             'f': [3, 1],
             'g': [1, 8]})

My issue is that the correct input (m2) to feed a Pandas DataFrame would look like this:

m2 = {
  'a': [1,1,3,7,7,7,7], 
  'b': [5,5,2,2,2,2,2],
  'c': [3,3,8,2,2,2,2],
  'd': [0,4,6,9,1,7,1],
  'e': [0,1,9,0,4,0,''],
  'f': ['','',3,'','',1,''],
  'g': ['',1,'','','','',8],
 }

That needs a kind of fill down (only for keys a, b, c) and a fill with blanks (for keys d to g) where needed.

I already asked on SO but got no answers.

**Gribouillis** · Mar-20-2020, 10:37 PM

You can manually fill an array similar to the image above:

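letters = 'abcdefg'
# Column position of each key: 'a' -> 0, 'b' -> 1, ...
keyindex = {x: i for i, x in enumerate(letters)}

table = []
ncol = len(keyindex)
row = [''] * ncol
cur = ncol  # one past the last filled column; ncol forces a new row at the start

for k, v in lst:
    i = keyindex[k]
    if i < cur:
        # The key does not come after the previous one: start a new row that
        # keeps the values left of column i and blanks out the rest.
        row = row[:i]
        row.extend('' for w in range(i, ncol))
        table.append(row)
    row[i] = v
    cur = i + 1

# Transpose the rows into one tuple of values per key.
m2 = dict(zip(letters, zip(*table)))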
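For anyone who wants to try the idea without the input file, here is a minimal self-contained sketch of the same row-filling logic. The function name tabulate_blocks, its fill parameter and the hard-coded pairs list are illustrative assumptions, not part of the code above; the pairs are the first two blocks of the sample file, written as integers like in the output shown in the first post.

```python
def tabulate_blocks(pairs, keys, fill=""):
    """Turn a flat list of (key, value) pairs into a dict of equal-length columns.

    A new row starts whenever a key does not come after the previously seen
    key. Values left of that key are carried over from the previous row
    (the "fill down"); the remaining columns are reset to `fill`.
    """
    index = {key: i for i, key in enumerate(keys)}
    ncol = len(keys)
    table = []
    row = [fill] * ncol
    cur = ncol  # forces a new row on the first pair
    for key, value in pairs:
        i = index[key]
        if i < cur:
            row = row[:i] + [fill] * (ncol - i)
            table.append(row)
        row[i] = value
        cur = i + 1
    # Transpose the rows into one list per key.
    return {key: [r[i] for r in table] for i, key in enumerate(keys)}


# Hypothetical input: the first two blocks of the sample file.
pairs = [
    ("a", 1), ("b", 5), ("c", 3),
    ("d", 0), ("e", 0),
    ("d", 4), ("e", 1), ("g", 1),
    ("a", 3), ("b", 2), ("c", 8), ("d", 6), ("e", 9), ("f", 3),
]

columns = tabulate_blocks(pairs, "abcdefg")
for key, values in columns.items():
    print(key, values)
```

Each column comes back as a plain list of the same length, which is the shape the first post asks for as DataFrame input.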

Xiesxes · Mar-21-2020, 12:28 AM

Hi @Gribouillis, thanks so much for your help. It seems to work.

I'll study your code to understand your logic, but for now one question.

In the actual file, the keys in the list "lst" have 2 or more letters.

For example, "a" would be "ADC" and "b" would be "MG", so in this case the string variable "letters" would be

letters = "ADCMGcdefg"

And it seems I will get the correct output.

Would it be a good idea to use a list for letters instead of a string?

If yes, how would your code change?

Thanks again!

**Gribouillis** · Mar-21-2020, 07:59 AM

Try with a tuple or a list

letters =  ("ADC", "MG", "c", "d", "e", "f", "g")
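A short sketch of why the sequence form matters here: iterating over a plain string yields single characters, so a multi-letter key such as "ADC" never ends up in the lookup dictionary. The variable names below are only for illustration.

```python
# Iterating a string yields one character at a time.
as_string = "ADCMGcdefg"
index_from_string = {x: i for i, x in enumerate(as_string)}
print("ADC" in index_from_string)   # False: only 'A', 'D', 'C', ... are keys

# Iterating a tuple (or list) yields the whole key names.
as_tuple = ("ADC", "MG", "c", "d", "e", "f", "g")
index_from_tuple = {x: i for i, x in enumerate(as_tuple)}
print(index_from_tuple["ADC"], index_from_tuple["MG"])   # 0 1
```

With the string version, looking up "ADC" in the dictionary would raise a KeyError.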

By the way, how did you create the image in the first post?

Xiesxes · (This post was last modified: Mar-21-2020, 04:57 PM by Xiesxes.)

(Mar-21-2020, 07:59 AM)Gribouillis Wrote: Try with a tuple or a list
letters =  ("ADC", "MG", "c", "d", "e", "f", "g")
By the way, how did you create the image in the first post?

Thanks so much for the help.

Regarding the image, I tabulated the data manually, set the colors and arrows in MS Excel, and finally took a screenshot, to better explain my goal.

Regards
