Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
numpyxl cell values
#1
OK, I am trying to use numpyxl for the first time. Here are some factoids

The input file looks like this (well, this is the .csv, because I can't insert a picture). I've lined up the commas so it is obvious what is going on

# ,Mfg,Cabinet,Drawer,Description
51,   ,      A,      ,
80,   ,      A,      ,
85,   ,      A,      ,
126,  ,      A,      ,
128,  ,      A,      ,
...for another 552 lines
Now, I want to identify the columns that have certain headings in them. To do this, I first tried to just enumerate row 1.
import openpyxl
# ...
# Open the workspace
ws = openpyxl.load_workbook(filename = "inventory.xlsx")
# ...
for i in range(1,20):
    cell = openpyxl.cell.cell.Cell(ws, row = 1, column = i)
    caption = cell.value
    if caption != None:
       print(cell.coordinate + ":" + caption, end = " ")
    else:
       print(cell.coordinate, end=": ")
       print("<None>")
print("")
Here is the workspace after the load_workspace operation:
Output:
- ws <openpyxl.workbook.workbook.Workbook object at 0x000001BD945CD070> Workbook + special variables + protected variables + function variables + active <Worksheet "Sorted"> Worksheet + calculation <openpyxl.workbook.properties.CalcProperties object> Parameters: calcId=0, calcMode=None, fullCalcOnLoad=True, refMode=None, iterate=None, iterateCount=None, iterateDelta=None, fullPrecision=None, calcCompleted=None, calcOnSave=None, concurrentCalc=None, concurrentManualCount=None, forceFullCalc=None CalcProperties + chartsheets [] list code_name None NoneType data_only False bool + defined_names <openpyxl.workbook.defined_name.DefinedNameList object> Parameters: definedName=[] DefinedNameList encoding 'utf-8' str + epoch datetime.datetime(1899, 12, 30, 0, 0) datetime + excel_base_date datetime.datetime(1899, 12, 30, 0, 0) datetime is_template False bool iso_dates False bool + loaded_theme ...long boring stuff... bytes mime_type 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml' str + named_styles ['Normal'] list path '/xl/workbook.xml' str + properties <openpyxl.packaging.core.DocumentProperties object> Parameters: creator='openpyxl', title=None, description=None, subject=None, identifier=None, language=None, created=datetime.datetime(2021, 3, 14, 17, 35, 37, 585452), modified=datetime.datetime(2021, 3, 14, 15, 29, 53), lastModifiedBy='...elided...', category=None, contentStatus=None, version=None, revision=None, keywords=None, lastPrinted=None DocumentProperties read_only False bool + rels <openpyxl.packaging.relationship.RelationshipList object> Parameters: Relationship=[] RelationshipList security None NoneType + shared_strings [] IndexedList + sheetnames [...elided...] list + style_names ['Normal'] list template False bool vba_archive None NoneType + views [<openpyxl.workbook.v...uping=True] list + worksheets [...elided...] list write_only False bool
If I examine the cell, I find that it is cell A1 (the .coordinate) but the value is None
Output:
- cell <openpyxl.cell.cell.Cell object at 0x000001BD9466F040> Cell + special variables + protected variables + function variables alignment ...long boring traceback... str base_date ...long boring traceback... str border ..long boring traceback... str col_idx 1 int column 1 int column_letter 'A' str comment None NoneType coordinate 'A1' str data_type 'n' str encoding 'utf-8' str fill ...long boring traceback... str font ...long boring traceback... str has_style False bool hyperlink None NoneType internal_value None NoneType is_date False bool number_format 'General' str + parent <openpyxl.workbook.workbook.Workbook object at 0x000001BD945CD070> Workbook pivotButton False bool protection ...long boring traceback... str quotePrefix False bool row 1 int style ...long boring traceback... str style_id ...long boring traceback... str value None NoneType
If I open this file in Excel, it shows me the values I illustrated in the .csv file.

The output I get is
Output:
A1: <None> B1: <None> C1: <None> D1: <None> E1: <None> F1: <None> G1: <None> H1: <None> I1: <None> J1: <None> K1: <None> L1: <None> M1: <None> N1: <None> O1: <None> P1: <None> Q1: <None> R1: <None> S1: <None>
What I expected to see was
Output:
A1: # A2: Mfg A3: Cabinet A4: Drawer A5: Description ...A6 would be blank, or maybe None, and I would have stopped the loop
What I then want to do is read cells A2..Annn where the cell value in Annn+1 is blank, or possibly None.
What I will do with the number I read is a bit of Web scraping that gets the Description. So once I have the part number from cell Axxx, if the Description, nominally Exxx, is blank (or None), I will make a Web query, which obtains the description for that part, and I will put that string in the Description field. Because I want to support schema evolution, I wrote the loop to iterate over row {A..S}1 (once I figure out whether it is an empty string or None, I will exit the heading loop when I no longer have a heading specified).

So the question is, what have I done wrong that I can't get the values from the cells? The documentation is fairly weak.

I am using Python 3.9.2 in Visual Studio 2019 under Windows 10.
Reply
#2
I have working code. My problem was I trusted the documentation.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Need to copy column of cell values from one workbook to another with openpyxl curranjohn46 3 11,224 Oct-12-2019, 10:57 PM
Last Post: curranjohn46

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020