Python Forum
Complete NEWB and openpyxl project
Thread Rating:
  • 6 Vote(s) - 3.5 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Complete NEWB and openpyxl project
#11
Note that reorder codes in these dummy data are unique, i.e. no repeat values. In the previous file there were repeated values. That may affect implementation
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#12
Netopia Wrote:I'm stuck trying to figure out how to go from this general list of tasks to actually doing it.
Start with simple tasks. For example start with the goal of printing all the records that belong to group1.
Reply
#13
(Jan-10-2019, 08:03 PM)buran Wrote: Note that reorder codes in these dummy data are unique, i.e. no repeat values. In the previous file there were repeated values. That may affect implementation

I'll add more records so there will be multiples. Thank you.
Reply
#14
Here is sample implementation (actually two, slightly different alternatives) how to read and sort the data. Hope it will kick-start you
Of course plenty can be improved.

from collections import defaultdict
from collections import namedtuple # used in OPTION 2
from openpyxl.reader.excel import load_workbook

filename = 'Example.xlsx'
wb = load_workbook(filename)
ws = wb.active

# read header and all data in two variables -> header and data
# Note the asteriks in frot of data
# This way data will hold everything except the first element in ws.values

header, *data = ws.values



# OPTION 1
print('\n\nOPTION 1 output')
# parse data into groups
groups = defaultdict(list)

for row in data:
    group = row[-1][-1]
    groups[group].append(row)

#print each group in desired order
for group, rows in sorted(groups.items()):
    print('\n\nGroup {}\n---------------\n'.format(group))
    for row in sorted(rows, key=lambda x: x[-1][:-1]):
        print(row)


# OPTION 2
print('\n\nOPTION 2 output')

# parse data into groups
groups = defaultdict(list)

# define namedtuple
fields = ('cust', 'first_name', 'last_name', 'full_name', 'reorder')
Record = namedtuple('Record', fields)

for row in data:
    # parse each row into Record namedtuple
    record = Record(*row)
    group = record.reorder[-1] # last char in the recorder property in the record
    groups[group].append(record)

#print each group in desired order
for group, records in sorted(groups.items()):
    print('\n\nGroup {}\n---------------\n'.format(group))
    for record in sorted(records, key=lambda rec: rec.reorder[:-1]):
        print(record)
OUTPUT:
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#15
Ok, I've modified it so that there are four of every Reorder value. Thank goodness for online random name generators! :) Not something I'd ever thought of googling before today.
Reply
#16
Gribouillis,

I had some success implementing the sample code you gave. I can now get a list of the data, with the len equaling the number of rows! Thank you!

buran
I'm going to go through your examples line by line and see what I can understand. I'll post back about anything I don't understand so that I can learn, if that's ok.

To all - what is the difference I see in some places between something like:

import openpyxl

and

from openpyxl import load_workbook

and

from openpyxl.reader.excel import load_workbook

??
Reply
#17
Both are the same. If you look at the docs or at the source code you will see that openpyxl package is organised in sub-packages. One of these sub-packages is openpyxl.reader. in sub-package openpyxl.reader there is module excel.py and load_workbook() function is part of this module.
So using from openpyxl.reader.excel import load_workbook is very explicit from where you import it.

At the same time the __init__.py file for top-level openpyxl package
imports the load_workbook() effectively exposing it and makes it possible to use from openpyxl import load_workbook or if you do just import openpyxl to use wb = openpyxl.load_workbook(some_file)

It's perfectly fine to use any of these - do as you prefer.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#18
Thank you for that explanation buran.

Already I'm learning! I puzzled over the line:

header, *data = ws.values
How did it know to pull the header out just by having a variable named "header"?

So, I played with:

this, that, those, *other = ws.values
And happily discovered that each variable = one row and that putting the * in front of the last one makes it skip rows for whatever number of other variables are already defined. Does this work for any collection of data, or just for openpyxl?

I'm having fun learning! Thank you guys for taking the time to help me. :)
Reply
#19
pay attention to comments in the code (and ignore the typing errors :-))
Quote:# read header and all data in two variables -> header and data
# Note the asteriks in frot of data
# This way data will hold everything except the first element in ws.values

and if you are interested to read further
PEP 3132 -- Extended Iterable Unpacking
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#20
(Jan-11-2019, 01:54 PM)buran Wrote: pay attention to comments in the code (and ignore the typing errors :-))
Quote:# read header and all data in two variables -> header and data
# Note the asteriks in frot of data
# This way data will hold everything except the first element in ws.values

and if you are interested to read further
PEP 3132 -- Extended Iterable Unpacking

I did read that. But even though you told me that, I didn't understand how it worked. How it 'knew' that header was the first row. I didn't know that having multiple variables corresponded to rows, one at a time. I didn't know that the * picked up from not just row 2, but from whatever number the last variable was. You pointed me in the right direction with your comments, and that helped me to learn more.

I promise to do my best to read whatever you have taken the time to write! :)

groups = defaultdict(list)

for row in data:
    group = row[-1][-1]
    groups[group].append(row)
I was wondering if the word 'row' had any significance other than being easy to understand. I replaced it temporarily with the word 'this', and everything still worked fine.

Do I understand then, that since you used (list) for defaultdict, that it created a list and so when you say 'for xxxx in yyy', python just grabs the largest unit size, in this case being an entire row? I really am new to this stuff, and this might be basic, but I'm trying to make sure that I don't assume something and then later be confused when something doesn't work the way I thought it did.

Joe
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Python newb need help Fictile 1 220 Apr-02-2024, 03:28 AM
Last Post: buran
  NameError issue with daughter's newb code MrGonk 2 1,459 Sep-16-2021, 01:29 PM
Last Post: BashBedlam
  Simple newb string question Involute 2 2,222 Sep-08-2019, 12:50 AM
Last Post: Involute
  please help this newb install pygame iofhua 7 5,954 May-15-2019, 01:09 PM
Last Post: buran
  Newb question: Debugging + Linting Python in Visual Studio Code Drone4four 1 2,437 Apr-15-2019, 06:19 AM
Last Post: perfringo
  Newb question about %02d %04d bennylava 30 19,554 Mar-05-2019, 11:23 PM
Last Post: snippsat
  Pthyon 3 question (newb) bennylava 11 5,887 Feb-28-2019, 06:04 PM
Last Post: buran
  newb selfie PatM 5 3,624 Feb-19-2019, 12:20 AM
Last Post: snippsat
  Newb Question - Threading in Crons vvarrior 2 2,794 Jul-20-2018, 08:12 PM
Last Post: vvarrior
  Matt's newb question 1 MattSS102 1 2,712 Aug-28-2017, 03:27 AM
Last Post: BerlingSwe

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020