Posts: 27
Threads: 2
Joined: Jan 2019
BURAN!
Still trying to get my head around it, but that worked! Excecpt for the print(data.items()), the program complained that data had no attribute 'items'. I just removed the '.items()' and it printed, though probably not what you had wanted me to see.
My next step is going to be to expand all columns in the worksheets, and then maybe try to get fancy (ALL of this is fancy for me!). I think I'll leave the sheet named 'Sheet' in there and then rename it and create a little chart showing the count of records (minus the header) for each subgroup.
Thank you for your help on this. I'm learning a lot and am looking forward to learning more. I really appreciate your help.
Joe
Posts: 8,154
Threads: 160
Joined: Sep 2016
Jan-15-2019, 02:21 PM
(This post was last modified: Jan-15-2019, 07:20 PM by buran.)
(Jan-15-2019, 02:14 PM)Netopia Wrote: Excecpt for the print(data.items()), the program complained that data had no attribute 'items'. I just removed the '.items()' and it printed, though probably not what you had wanted me to see.
yep, my mistake - I mean to print(groups)
Posts: 27
Threads: 2
Joined: Jan 2019
A little further each day... but always needing a bit of hand holding.
I decided that it would be a good thing to be able to have the columns be the width of the data. So, I to google around and see how. I found some code and worked it in, but it doesn't seem to work to set the column width, but I can't figure out why. I print out the various bits of data as the program runs, and it all seems to be accurate. But is never gets applied... but also doesn't give any error!
Same with me trying to set the font as a fixed width font, so that I can know the width is sufficient. I've put in the code, but nothing happens and no errors.
Pointing out errors would be a blessing for anyone willing to look!
Thank you in advance... little by little, I'm learning something every day!
import os, glob, shutil, openpyxl
from pathlib import Path
from collections import defaultdict
from openpyxl.reader.excel import load_workbook
from openpyxl import Workbook
from openpyxl.compat import range
from openpyxl.utils import get_column_letter
from openpyxl.styles import Font
from datetime import datetime
startTime = datetime.now()
os.chdir("c://test")
mainfile = 'Example.xlsx'
wb = load_workbook(mainfile)
ws = wb.active
header, *data = ws.values
groups = defaultdict(lambda: defaultdict(list))
for row in data:
subgroup, group = row[-1][:-1], row[-1][-1]
groups[group][subgroup].append(row)
for group, subgroups in sorted(groups.items()):
wb = Workbook()
wsf =Font(name = 'Courier New', sz = 12)
wb.remove(wb.active)
dest_filename = (f'{mainfile[:-5]} Group {group}.xlsx')
for subgroup, rows in sorted(subgroups.items()):
ws = wb.create_sheet(title = subgroup)
ws.append(header)
for row in rows:
ws.append(row)
wb.save(filename = dest_filename)
#print(datetime.now() - startTime)
dims = {}
for row in ws.rows:
for cell in row:
if cell.value:
cell.font = wsf
dims[cell.column] = max((dims.get(cell.column, 0), len(str(cell.value))))
for col, value in dims.items():
ws.column_dimensions[col].width = int(value)
print(int(value))
print (cell, col, value)
print(group,subgroup, dims)
wb.save(filename = dest_filename)
print(datetime.now() - startTime)
Posts: 8,154
Threads: 160
Joined: Sep 2016
Jan-16-2019, 07:04 AM
(This post was last modified: Jan-17-2019, 05:16 PM by buran.)
The problem is you work only on last sheet in the last workbook. Your code must be executed for each worksheet in each workbook. For example make it part of loop starting at line 35. You may want to make it a function - that will be next step in better organizing your code. And hopefully will allow you to see that it can be simplified.
All that said you have to learn how to debug. Although it was pretty obvious this time, you may want to check the link in my signature for some general hints.
Posts: 27
Threads: 2
Joined: Jan 2019
Well buran, thanking you is getting to be a daily activity! Sure enough, I looked at the last sheet of the last workbook and there was the font change... but I would not have thought to look through each sheet of each book!
I will read the info at your link. Thank you for the hint forwards!
Joe
Posts: 8,154
Threads: 160
Joined: Sep 2016
(Jan-16-2019, 03:29 PM)Netopia Wrote: but I would not have thought to look through each sheet of each book! Maybe not yet, but after some time you will be able to see it by just looking at the code
Posts: 27
Threads: 2
Joined: Jan 2019
buran,
Per your suggestions and the debug link, I moved all the way back to just making changes to the full data set and just saving it as a different file. Here's what I have so far, and it's working. I'll continue to try to move forward piece by piece. Thank you for all your help and patience.
Joe
import os, glob, shutil, openpyxl
from pathlib import Path
from collections import defaultdict
from openpyxl.reader.excel import load_workbook
from openpyxl import Workbook
from openpyxl.compat import range
from openpyxl.utils import get_column_letter
from openpyxl.styles import Font
from datetime import datetime
startTime = datetime.now()
os.chdir("c://test")
mainfile = 'Example.xlsx'
wb = load_workbook(mainfile)
ws = wb.active
dims = {}
for row in ws.rows:
for cell in row:
cell.font = Font(name = 'Courier New', sz = 12)
if cell.value:
dims[cell.column] = max((dims.get(cell.column, 0), len(str(cell.value))))
for col, value in dims.items():
#print(col, value)
ws.column_dimensions[col].width = int(value * 1.3)
wb.save(filename = "output.xlsx")
print(datetime.now() - startTime)
Posts: 8,154
Threads: 160
Joined: Sep 2016
Hm, at the moment you change the column width of the original file and just save it under different name. I didn't suggest anything like this. My suggestion was to put the code that change the column width within the loop where you write data.
Posts: 27
Threads: 2
Joined: Jan 2019
Jan-17-2019, 05:33 PM
(This post was last modified: Jan-17-2019, 05:34 PM by Netopia.)
I took the general thought from this in your debugging link:
Quote:Once your program compiles cleanly and the duck doesn’t raise any major objections, if there’s still a bug then see if you can break your code up into smaller methods, each of which does exactly one logical operation. A common error amongst all programmers, not just beginners, is to make methods that try to do multiple things and do them poorly. Smaller methods are easier to understand and therefore easier for both you and the duck to see the bugs.
I figured that I would go back and make sure that I understood what was happening at each step, doing the simplest thing and then moving forward. So... I just changed the original file with the changes I wanted and then saved it to a different filename.
Now that I have that working and understand what it's really doing, I'll move on to trying to just get the groups to populate in new workbooks.
When I get that down (and understand it) I'll try to do the subgroups into different worksheets. I realized when reading that debug link was that one of my main problems was tht I didn't really understand the code I was working with.
No, you didn't suggest I do this, but you pointed to the debugging link, which gave me a better understanding of what I should be doing so that I can debug better.
Have I made a wrong choice?
Posts: 8,154
Threads: 160
Joined: Sep 2016
No, I just misunderstood what you said in the previous post. That's a good approach - break it in small steps and make sure each step works as expected and you understand it. That's a natural step to properly structure your program is small building blocks that you can reuse - e.g. functions, classes etc.
|