numpyxl cell values - supuflounder - Mar-14-2021
OK, I am trying to use numpyxl for the first time. Here are some factoids
The input file looks like this (well, this is the .csv, because I can't insert a picture). I've lined up the commas so it is obvious what is going on
# ,Mfg,Cabinet,Drawer,Description
51, , A, ,
80, , A, ,
85, , A, ,
126, , A, ,
128, , A, ,
...for another 552 lines Now, I want to identify the columns that have certain headings in them. To do this, I first tried to just enumerate row 1.
import openpyxl
# ...
# Open the workspace
ws = openpyxl.load_workbook(filename = "inventory.xlsx")
# ...
for i in range(1,20):
cell = openpyxl.cell.cell.Cell(ws, row = 1, column = i)
caption = cell.value
if caption != None:
print(cell.coordinate + ":" + caption, end = " ")
else:
print(cell.coordinate, end=": ")
print("<None>")
print("") Here is the workspace after the load_workspace operation:
Output: - ws <openpyxl.workbook.workbook.Workbook object at 0x000001BD945CD070> Workbook
+ special variables
+ protected variables
+ function variables
+ active <Worksheet "Sorted"> Worksheet
+ calculation <openpyxl.workbook.properties.CalcProperties object>
Parameters:
calcId=0, calcMode=None, fullCalcOnLoad=True, refMode=None, iterate=None, iterateCount=None, iterateDelta=None, fullPrecision=None, calcCompleted=None, calcOnSave=None, concurrentCalc=None, concurrentManualCount=None, forceFullCalc=None CalcProperties
+ chartsheets [] list
code_name None NoneType
data_only False bool
+ defined_names <openpyxl.workbook.defined_name.DefinedNameList object>
Parameters:
definedName=[] DefinedNameList
encoding 'utf-8' str
+ epoch datetime.datetime(1899, 12, 30, 0, 0) datetime
+ excel_base_date datetime.datetime(1899, 12, 30, 0, 0) datetime
is_template False bool
iso_dates False bool
+ loaded_theme ...long boring stuff... bytes
mime_type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml' str
+ named_styles ['Normal'] list
path '/xl/workbook.xml' str
+ properties <openpyxl.packaging.core.DocumentProperties object>
Parameters:
creator='openpyxl', title=None, description=None, subject=None, identifier=None, language=None, created=datetime.datetime(2021, 3, 14, 17, 35, 37, 585452), modified=datetime.datetime(2021, 3, 14, 15, 29, 53), lastModifiedBy='...elided...', category=None, contentStatus=None, version=None, revision=None, keywords=None, lastPrinted=None DocumentProperties
read_only False bool
+ rels <openpyxl.packaging.relationship.RelationshipList object>
Parameters:
Relationship=[] RelationshipList
security None NoneType
+ shared_strings [] IndexedList
+ sheetnames [...elided...] list
+ style_names ['Normal'] list
template False bool
vba_archive None NoneType
+ views [<openpyxl.workbook.v...uping=True] list
+ worksheets [...elided...] list
write_only False bool
If I examine the cell, I find that it is cell A1 (the .coordinate) but the value is None
Output: - cell <openpyxl.cell.cell.Cell object at 0x000001BD9466F040> Cell
+ special variables
+ protected variables
+ function variables
alignment ...long boring traceback... str
base_date ...long boring traceback... str
border ..long boring traceback... str
col_idx 1 int
column 1 int
column_letter 'A' str
comment None NoneType
coordinate 'A1' str
data_type 'n' str
encoding 'utf-8' str
fill ...long boring traceback... str
font ...long boring traceback... str
has_style False bool
hyperlink None NoneType
internal_value None NoneType
is_date False bool
number_format 'General' str
+ parent <openpyxl.workbook.workbook.Workbook object at 0x000001BD945CD070> Workbook
pivotButton False bool
protection ...long boring traceback... str
quotePrefix False bool
row 1 int
style ...long boring traceback... str
style_id ...long boring traceback... str
value None NoneType
If I open this file in Excel, it shows me the values I illustrated in the .csv file.
The output I get is
Output: A1: <None>
B1: <None>
C1: <None>
D1: <None>
E1: <None>
F1: <None>
G1: <None>
H1: <None>
I1: <None>
J1: <None>
K1: <None>
L1: <None>
M1: <None>
N1: <None>
O1: <None>
P1: <None>
Q1: <None>
R1: <None>
S1: <None>
What I expected to see was
Output: A1: #
A2: Mfg
A3: Cabinet
A4: Drawer
A5: Description
...A6 would be blank, or maybe None, and I would have stopped the loop
What I then want to do is read cells A2..Annn where the cell value in Annn+1 is blank, or possibly None.
What I will do with the number I read is a bit of Web scraping that gets the Description. So once I have the part number from cell Axxx, if the Description, nominally Exxx, is blank (or None), I will make a Web query, which obtains the description for that part, and I will put that string in the Description field. Because I want to support schema evolution, I wrote the loop to iterate over row {A..S}1 (once I figure out whether it is an empty string or None, I will exit the heading loop when I no longer have a heading specified).
So the question is, what have I done wrong that I can't get the values from the cells? The documentation is fairly weak.
I am using Python 3.9.2 in Visual Studio 2019 under Windows 10.
RE: numpyxl cell values - supuflounder - Mar-21-2021
I have working code. My problem was I trusted the documentation.
|