Posts: 1,094
Threads: 143
Joined: Jul 2017
I made a dictionary where the key is an integer, a student number and the value is also an integer, the student's score.
I want to loop through each row of the source file column 2, which is the student number and loop through the dictionary. If it finds the student number in that row, it should write corresponding value in that row, column 4 of the output file.
But, I am not getting any values inserted in the output file. I get the dictionary with this:
StudentNumAndScore = {}
for rowNum in range(2, maxRow +1):
StudentNumAndScore[activeSheetwb1.cell(row=rowNum, column=2).value] = activeSheetwb1.cell(row=rowNum, column=1).value and activeSheetwb1.cell(row=rowNum, column=2).value is an integer like: 1625010401
Posts: 8,167
Threads: 160
Joined: Sep 2016
First of all, to answer the question in the thread title - yes, an int can be a key in the dict. actually any hashable object can be (e.g. tuple).
As per the problem in the thread body - post the full code (minimal runnable snippet). From what you have shown it just populates the StudentNumAndScore dict using column 2 values as key and column1 values as values in the dict.
Posts: 1,094
Threads: 143
Joined: Jul 2017
Sep-28-2017, 01:02 PM
(This post was last modified: Sep-28-2017, 01:02 PM by Pedroski55.)
Thanks. I've been trying various approaches with no luck. Here is what I want to do. I thought a dictionary would be the best approach. I normally do this by hand, but I am reading the book "Automate the Boring Stuff" and this is boring! Students go away for a month to a dancing competition in Beijing, or a work experience in Shanghai. I move them to the bottom of my class lists. Slowly the order and the database order do not resemble each other. It's easy to get the wrong score next to a name.
#! /usr/bin/python3
# get the student number and score, then copy score to another file
# at the same time check to see if the student number corresponds to the score
# because the order of the students in the class register is not always the same as the database
import os, openpyxl
pathToExcel = input()
nameOfExcelFile = input()
saveFileName = nameOfExcelFile.split('.')
# this file exists
# it is the results file from the marking program saved as .xlsx
# looks like 16Eng4W4res.xlsx
wb1 = openpyxl.load_workbook(pathToExcel + nameOfExcelFile)
# this file only has 1 sheet so
activeSheetwb1 = wb1.active
# get max Row
maxRow = activeSheetwb1.max_row
# column 1 contains the score, column 2 the student numbers
# read the student numbers into a dictionary
# row 1 is just column headers, don't need these. Start at row 2
# StudentNumAndScore[activeSheetwb1.cell(row=rowNum, column=2).value] is the key
# the key is the student number
# activeSheetwb1.cell(row=rowNum, column=1).value is the score
StudentNumAndScore = {}
for rowNum in range(2, maxRow +1):
StudentNumAndScore[activeSheetwb1.cell(row=rowNum, column=2).value] = activeSheetwb1.cell(row=rowNum, column=1).value
# now open the file to be written to
# the output file exists, just the student numbers are not in the same order
# as the marking program output
# I want to put each score against the correct student number
# I thought a dictionary would do this elegantly
outputFile = input()
wb2 = openpyxl.load_workbook(pathToExcel + outputFile)
# later on I can loop through sheetNameswb2
sheetNameswb2 = wb2.get_sheet_names()
activeSheetwb2 = wb2.get_sheet_by_name(sheetNameswb2[0])
# get the keys and put them in a list. How to do this directly from the dictionary??
keys = list(StudentNumAndScore.keys())
# for each key loop the rows in column 1 and find the key
# all these keys are in the active sheet, but I do not get any output
# nothing is written to "activeSheetwb2.cell(row=rowNum, column=4, value=StudentNumAndScore[key])"
for key in keys:
for rowNum in range(2, maxRow + 1): # check each row for the key. Each key is present somewhere
if key == activeSheetwb2.cell(row=rowNum, column=1).value:
activeSheetwb2.cell(row=rowNum, column=4, value=StudentNumAndScore[key])
# save the file
wb2.save(pathToExcel + 'try1.xlsx') I am not getting anything in try1.xlsx column 4!
Also, I'd like a more elegant use of the dictionary. I feel there is a better way to do this.
I can do this by directly comparing wb1 and wb2 in 2 loops.
Very grateful for any tips, suggestions, hints!
Posts: 8,167
Threads: 160
Joined: Sep 2016
Sep-28-2017, 01:29 PM
(This post was last modified: Sep-28-2017, 01:29 PM by buran.)
Not sure I understand your setup right, but anyway - once you get the data in the dict, you don't need to put the student numbers in a list. just loop over the cells with the student number in the output file and use the respective value to retrieve the correct value from the already populated dict. in the above snippet replace lines 56-65 with this one
maxRow2 = activeSheetwb2.max_row
for rowNum in range(2, maxRow2 + 1):
key = activeSheetwb2.cell(row=rowNum, column=1).value
value = StudentNumAndScore.get(key,'')
activeSheetwb2.cell(row=rowNum, column=4, value=value)
Posts: 1,094
Threads: 143
Joined: Jul 2017
Sep-30-2017, 04:24 AM
(This post was last modified: Sep-30-2017, 08:17 AM by Pedroski55.)
Thank you very much, that is just what I wanted. Neat, concise. Just 1 hiccup: I still get nothing in column 4 after saving! That is weird!
The first key in the dictionary, for some reason, is 1625010432
In the shell, using this key I get this:
Quote:>>> StudentNumAndScore[1625010432]
23
>>>
Your code should write 23 to column 4 in the same row as student number 1625010432.
Quote:outputFile = input()
16Eng4ScoresSorted.xlsx
After saving the file:
Quote:wb2.save(pathToExcel + outputFile)
there is nothing in column 4! I can't figure this out!
Posts: 8,167
Threads: 160
Joined: Sep 2016
import openpyxl
xls_in='in.xlsx'
xls_out='out.xlsx'
wb1 = openpyxl.load_workbook(xls_in)
active_sheet_wb1 = wb1.active
max_row = active_sheet_wb1.max_row
students_and_scores = {}
for row_num in range(2, max_row +1):
key = active_sheet_wb1.cell(row=row_num, column=2).value
students_and_scores[key] = active_sheet_wb1.cell(row=row_num, column=1).value
wb2 = openpyxl.load_workbook(xls_out)
active_sheet_wb2 = wb2.active #get_sheet_by_name(sheetNameswb2[0])
max_row2 = active_sheet_wb2.max_row
for row_num in range(2, max_row + 1):
key = active_sheet_wb2.cell(row=row_num, column=1).value
value = students_and_scores.get(key,'')
active_sheet_wb2.cell(row=row_num, column=4, value=value)
wb2.save('try.xlsx') this is minimal working script, tested by me. feel free to modify it.
Posts: 1,094
Threads: 143
Joined: Jul 2017
Oct-04-2017, 11:50 PM
(This post was last modified: Oct-04-2017, 11:51 PM by Pedroski55.)
Thanks for your trouble, I appreciate it! I am still tinkering with this and it is STILL not working.
Attached is a screen shot of the output file. You can see that row 4, column 2 is the first student number. I tried the insert the values in column 8, then column 12. I always save as a copy, just to be careful.
This is from the interactive shell:
Quote:>>> activeSheetwb2.cell(row=4, column=2).value
'1625010401'
>>> StudentNumAndScore[1625010401]
16
>>> key = 1625010401
>>> value = StudentNumAndScore.get(key,'nix')
>>> value
16
The key is correct, starting at row 4, column 2 and the value is correct, 16:
for rowNum in range(4, maxRow2 + 1):
key = activeSheetwb2.cell(row=rowNum, column=2).value
value = StudentNumAndScore.get(key, 'nix')
activeSheetwb2.cell(row=rowNum, column=12, value=value) If a key is not found, I set the value 'nix', meaning 'nothing'. From the screenshot, you can see, all I get is 'nix'.
Can anyone see what the problem is? Why do I not get the 'value' in the correct row and column?
EDIT: I finally answered my own question! If I write int(key) it works!!
for rowNum in range(4, 10):
key = activeSheetwb2.cell(row=rowNum, column=2).value
value = StudentNumAndScore.get(int(key), 'nix')
print(key, value)
Quote:1625010401 16
1625010402 15
1625010404 16
1625010405 17
1625010406 15
1625010407 18
Posts: 8,167
Threads: 160
Joined: Sep 2016
Well, openpyxl guess the type and in your case in the input file student numbers are actually numbers. That is visible from the example in post#5.
Now it is clear that in your target file student numbers are text. Note single quotes in your example. And that is something new that you didn't mention so far...
If I have to guess for some reason the cells were formatted as text.
|