Python Forum
Complete NEWB and openpyxl project
Thread Rating:
  • 6 Vote(s) - 3.5 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Complete NEWB and openpyxl project
#21
Like almost always, the easiest way to understand this is to print data.
data is tuple of tuples. Each of these inner tuples is one row. Each element in one such tuple is the value in a single cell.
Tuple is an iterable, i.e. you can iterate over its elements one by one. That is why you can do for row in data:. In each iteration of the loop row will hold next sub-tuple.

Now, defaultdict(list) has effect on groups[group].append(row). Here is the docs for defaultdict. In this case, using list, it will have default value of empty list.
When we do groups[group].append(row) we append respective row to the value for key group (Note that group is variable, it will have some values like '1', '2', '3' or '4', depending what is the last char in the last element (row[-1][-1]) in the respective row (tuple). What will happen if we were not using defaultdict but ordinary dict and there was no such key in the dict? Yes, you will get KeyError. With the defaultdict in our case it just creates empty list, map it to key group and allow you to append row to that list, without raising an exception. Of course there is way to do something like this also with normal dict, but with defaultdict is more convenient.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#22
buran,

"If you can't explain it to a six year old, you don't understand it yourself.", Albert Einstein

I can see by your excellent explanations that you take this quote to heart! :)

Ok, so groups is a list. And group is a collection of lists based on and named after the last character of the last element.

So ultimately groups becomes a list of lists?

I think I'm starting to understand... a little. ;)
Reply
#23
(Jan-11-2019, 02:54 PM)Netopia Wrote: Ok, so groups is a list. And group is a collection of lists based on and named after the last character of the last element.

So ultimately groups becomes a list of lists?


I am afraid not. There are different data structures (containers) in python - these are basic ones, but there are also others, e.g. - in the collections module.

groups is a defaultdict - in simple terms that's a dict (dictionary) with default factor value - in our case - [empty] list.
dict (and also the defaultdict) is a mapping type, i.e. it has key:value pairs. keys are unique. in our case, with the defaultdict we have lists as values.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#24
OK, so I did some reading and I think I understand the block below, but need some clarification:

#print each group in desired order
for group, rows in sorted(groups.items()):
    print('\n\nGroup {}\n---------------\n'.format(group))
    for row in sorted(rows, key=lambda x: x[-1][:-1]):
        print(row)
I read some on 'lambda', and I get the key defined as X and x being the last element in rod, but I don't get the usages of the [:-1]

Joe
Reply
#25
lambda is the key function that is applied to each member of rows in order to sort rows. So x is a single row.
x[-1] is the last element in the row. python allows to use negative indexes.
[:-1] is slicing notation. it will return everything, by the last char of the last element

you can read https://docs.python.org/3/tutorial/intro...ml#strings


>>> x = (33011907.77726958, 'Vancito E.', 'Gumbs SR', 'Vancito E. Gumbs SR', 'ABCDE1')
>>> x[-1]
'ABCDE1'
>>> x[-1][:-1]
'ABCDE'
>>> 
Now when I think, given that last char is always the same for all elements in a given group, it can be just for row in sorted(rows, key=lambda x: x[-1]):
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#26
Your last thought answered my question. I thought that:

x[-1][:-1]
Was sorting by the last character of the last element of the row... except that the output was actually sorted by the complete element.

Thanks
Reply
#27
(Jan-14-2019, 08:24 PM)Netopia Wrote: Your last thought answered my question. I thought that:
x[-1][:-1]
Was sorting by the last character of the last element of the row... except that the output was actually sorted by the complete element. Thanks
No, x[-1][:-1] sorts by everything up to, but without last char of the element. And because the last char is constant for a group, it can be simplified to x[-1] (i.e. the whole last element). It's a small but important difference if the last char is variable one.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#28
Thank you for that explanation... I ended up needing to know that! :)

I've gotten stuck again. I'm managed to create spreadsheets named as a concatenation of the original file and the group. I've managed to take the "subgroup" (everything before the final digit in the reorder field) and create sheets named for them in each workbook. At least a small sense of accomplishment! I was able to reuse your code to figure out how to get where I am.

Now I can't seem to figure out how to append the header and then the rows of group data to the the sheets. Here's what I've got, but I get an output message that says:

Output:
RESTART: C:\Users\Joe\Dropbox\Python\Misc py files\group-sort examples from buran.py OPTION 1 output Traceback (most recent call last): File "C:\Users\Joe\Dropbox\Python\Misc py files\group-sort examples from buran.py", line 47, in <module> ws.cell(row=1, column=1).value = header File "C:\Python3x\lib\site-packages\openpyxl\cell\cell.py", line 294, in value self._bind_value(value) File "C:\Python3x\lib\site-packages\openpyxl\cell\cell.py", line 207, in _bind_value raise ValueError("Cannot convert {0!r} to Excel".format(value)) ValueError: Cannot convert ('Cust#', 'First Name', 'Last Name', 'Full Name', 'Reorder') to Excel
From this code:

import os, glob, shutil, openpyxl
from collections import defaultdict
from openpyxl.reader.excel import load_workbook
from openpyxl import Workbook
from openpyxl.compat import range
from openpyxl.utils import get_column_letter

os.chdir("c://test")
 
mainfile = 'Example.xlsx'
wb = load_workbook(mainfile)
ws = wb.active
 
# read header and all data in two variables -> header and data
# Note the asteriks in frot of data
# This way data will hold everything except the first element in ws.values
 
header, *data = ws.values
 
 
 
# OPTION 1
print('\n\nOPTION 1 output')
# parse data into groups
groups = defaultdict(list)
 
for row in data:
    group = row[-1][-1]
    groups[group].append(row)
 
###print each group in desired order
##for group, rows in sorted(groups.items()):
##    print('\n\nGroup {}\n---------------\n'.format(group))
##    for row in sorted(rows, key=lambda x: x[-1]):
##        print(row)


for group, rows in sorted(groups.items()):
    wb = Workbook()
    wb.remove(wb.active)
    dest_filename = (mainfile[:-5] + " Group " + group + ".xlsx")
    for row in sorted(rows, key=lambda x: x[-1][:-1]):
        subgroup = row[-1][:-1]
        wb.create_sheet(title = subgroup)
        ws.cell(row=1, column=1).value = header
       
    wb.save(filename = dest_filename)
Thoughts?
Reply
#29
Kept trying, but thus far to no avail. I think the real problem is that I'm not conceptualizing what's happening at each stage and being able to then convert that to proper code.

What I wrote below got rid of the error, but didn't write any data into the various workbooks/spreadsheets either (other than the sheet names). The print(header) does print the five field names a BUNCH of times, which I also didn't expect.

for group, rows in sorted(groups.items()):
    wb = Workbook()
    wb.remove(wb.active)
    dest_filename = (mainfile[:-5] + " Group " + group + ".xlsx")
    
    for row in sorted(rows, key=lambda x: x[-1][:-1]):
        subgroup = row[-1][:-1]
        wb.create_sheet(title = subgroup)
        for item in header:
            print(item)
            r=0
            ch=0   #column for the header record
            ws.cell(row=(r+1), column=(ch+1)).value = item
Reply
#30
you have to look at worksheet.append() method

Now, I didn't realise you will write subgroups in different sheets. So I've made a little change to groups. Not tested but

from collections import defaultdict
from openpyxl.reader.excel import load_workbook
from openpyxl import Workbook

os.chdir("c://test")
  
mainfile = 'Example.xlsx'
wb = load_workbook(mainfile)
ws = wb.active
  
# read header and all data in two variables -> header and data
# Note the asteriks in frot of data
# This way data will hold everything except the first element in ws.values
  
header, *data = ws.values
  

# parse data into groups
# creating defaultdict that has default factory value defaultdict(list)
# effectively groups is dict of dicts of lists
groups = defaultdict(lambda: defaultdict(list))
  
for row in data:
    subgroup, group = row[-1][:-1], row[-1][-1]
    groups[group][subgroup].append(row)

# just to visualise the groups    
print(groups) # fixed it - it should be groups, not data

# write to worksheet
for group, subgroups in sorted(groups.items()):
    wb = Workbook()
    wb.remove(wb.active)
    dest_filename = (f'{mainfile[:-5]} Group {group}.xlsx')
    for subgroup, rows in sorted(subgroups.items()):
        ws = wb.create_sheet(title = subgroup)
        ws.append(header)
        for row in rows:
            ws.append(row)   
    wb.save(filename = dest_filename)
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Python newb need help Fictile 1 210 Apr-02-2024, 03:28 AM
Last Post: buran
  NameError issue with daughter's newb code MrGonk 2 1,451 Sep-16-2021, 01:29 PM
Last Post: BashBedlam
  Simple newb string question Involute 2 2,210 Sep-08-2019, 12:50 AM
Last Post: Involute
  please help this newb install pygame iofhua 7 5,939 May-15-2019, 01:09 PM
Last Post: buran
  Newb question: Debugging + Linting Python in Visual Studio Code Drone4four 1 2,427 Apr-15-2019, 06:19 AM
Last Post: perfringo
  Newb question about %02d %04d bennylava 30 19,476 Mar-05-2019, 11:23 PM
Last Post: snippsat
  Pthyon 3 question (newb) bennylava 11 5,860 Feb-28-2019, 06:04 PM
Last Post: buran
  newb selfie PatM 5 3,611 Feb-19-2019, 12:20 AM
Last Post: snippsat
  Newb Question - Threading in Crons vvarrior 2 2,776 Jul-20-2018, 08:12 PM
Last Post: vvarrior
  Matt's newb question 1 MattSS102 1 2,703 Aug-28-2017, 03:27 AM
Last Post: BerlingSwe

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020