numpyxl cell values - Printable Version

numpyxl cell values - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: numpyxl cell values (/thread-32895.html)

numpyxl cell values - supuflounder - Mar-14-2021

OK, I am trying to use numpyxl for the first time. Here are some factoids

The input file looks like this (well, this is the .csv, because I can't insert a picture). I've lined up the commas so it is obvious what is going on

# ,Mfg,Cabinet,Drawer,Description
51,   ,      A,      ,
80,   ,      A,      ,
85,   ,      A,      ,
126,  ,      A,      ,
128,  ,      A,      ,
...for another 552 lines

Now, I want to identify the columns that have certain headings in them. To do this, I first tried to just enumerate row 1.

import openpyxl
# ...
# Open the workspace
ws = openpyxl.load_workbook(filename = "inventory.xlsx")
# ...
for i in range(1,20):
    cell = openpyxl.cell.cell.Cell(ws, row = 1, column = i)
    caption = cell.value
    if caption != None:
       print(cell.coordinate + ":" + caption, end = " ")
    else:
       print(cell.coordinate, end=": ")
       print("<None>")
print("")

Here is the workspace after the load_workspace operation:

Output:-		ws	<openpyxl.workbook.workbook.Workbook object at 0x000001BD945CD070>	Workbook
+		special variables		
+		protected variables		
+		function variables		
+		active	<Worksheet "Sorted">	Worksheet
+		calculation	<openpyxl.workbook.properties.CalcProperties object>
Parameters:
calcId=0, calcMode=None, fullCalcOnLoad=True, refMode=None, iterate=None, iterateCount=None, iterateDelta=None, fullPrecision=None, calcCompleted=None, calcOnSave=None, concurrentCalc=None, concurrentManualCount=None, forceFullCalc=None	CalcProperties
+		chartsheets	[]	list
		code_name	None	NoneType
		data_only	False	bool
+		defined_names	<openpyxl.workbook.defined_name.DefinedNameList object>
Parameters:
definedName=[]	DefinedNameList
		encoding	'utf-8'	str
+		epoch	datetime.datetime(1899, 12, 30, 0, 0)	datetime
+		excel_base_date	datetime.datetime(1899, 12, 30, 0, 0)	datetime
		is_template	False	bool
		iso_dates	False	bool
+		loaded_theme	...long boring stuff... 	bytes
		mime_type	'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml'	str
+		named_styles	['Normal']	list
		path	'/xl/workbook.xml'	str
+		properties	<openpyxl.packaging.core.DocumentProperties object>
Parameters:
creator='openpyxl', title=None, description=None, subject=None, identifier=None, language=None, created=datetime.datetime(2021, 3, 14, 17, 35, 37, 585452), modified=datetime.datetime(2021, 3, 14, 15, 29, 53), lastModifiedBy='...elided...', category=None, contentStatus=None, version=None, revision=None, keywords=None, lastPrinted=None	DocumentProperties
		read_only	False	bool
+		rels	<openpyxl.packaging.relationship.RelationshipList object>
Parameters:
Relationship=[]	RelationshipList
		security	None	NoneType
+		shared_strings	[]	IndexedList
+		sheetnames	[...elided...]	list
+		style_names	['Normal']	list
		template	False	bool
		vba_archive	None	NoneType
+		views	[<openpyxl.workbook.v...uping=True]	list
+		worksheets	[...elided...]	list
		write_only	False	bool

If I examine the cell, I find that it is cell A1 (the .coordinate) but the value is None

Output:-		cell	<openpyxl.cell.cell.Cell object at 0x000001BD9466F040>	Cell
+		special variables		
+		protected variables		
+		function variables		
		alignment	...long boring traceback... str
		base_date	...long boring traceback... str
		border	..long boring traceback... 	str
		col_idx	1	int
		column	1	int
		column_letter	'A'	str
		comment	None	NoneType
		coordinate	'A1'	str
		data_type	'n'	str
		encoding	'utf-8'	str
		fill	...long boring traceback... str
		font	...long boring traceback... str
		has_style	False	bool
		hyperlink	None	NoneType
		internal_value	None	NoneType
		is_date	False	bool
		number_format	'General'	str
+		parent	<openpyxl.workbook.workbook.Workbook object at 0x000001BD945CD070>	Workbook
		pivotButton	False	bool
		protection	...long boring traceback... 	str
		quotePrefix	False	bool
		row	1	int
		style	...long boring traceback... 	str
		style_id	...long boring traceback... 	str
		value	None	NoneType

If I open this file in Excel, it shows me the values I illustrated in the .csv file.

The output I get is

Output:A1: <None>
B1: <None>
C1: <None>
D1: <None>
E1: <None>
F1: <None>
G1: <None>
H1: <None>
I1: <None>
J1: <None>
K1: <None>
L1: <None>
M1: <None>
N1: <None>
O1: <None>
P1: <None>
Q1: <None>
R1: <None>
S1: <None>

What I expected to see was

Output:A1: #
A2: Mfg
A3: Cabinet
A4: Drawer
A5: Description
...A6 would be blank, or maybe None, and I would have stopped the loop

What I then want to do is read cells A2..Annn where the cell value in Annn+1 is blank, or possibly None.
What I will do with the number I read is a bit of Web scraping that gets the Description. So once I have the part number from cell Axxx, if the Description, nominally Exxx, is blank (or None), I will make a Web query, which obtains the description for that part, and I will put that string in the Description field. Because I want to support schema evolution, I wrote the loop to iterate over row {A..S}1 (once I figure out whether it is an empty string or None, I will exit the heading loop when I no longer have a heading specified).

So the question is, what have I done wrong that I can't get the values from the cells? The documentation is fairly weak.

I am using Python 3.9.2 in Visual Studio 2019 under Windows 10.

RE: numpyxl cell values - supuflounder - Mar-21-2021

I have working code. My problem was I trusted the documentation.