Python Forum
Identify Hidden rows in xls - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: Identify Hidden rows in xls (/thread-9452.html)



Identify Hidden rows in xls - ayonsarkar - Apr-10-2018

Hi,

I have a xls file which is large and I am using xlrd to read the file. However there are hidden rows in the xls which i donot want to read.

I am using open_workbook to open the file, and getting the sheet object by workbook.sheet_by_name.

How can i check if a certain row is hidden? Can you please suggest? From the forums I see people are suggesting to use xlrd.sheet.RowInfo, but how i do set the row info from the sheet object?

It says, rowinfo is Derived from ROW records. How can i get Row records? Any help will be appreicated?


RE: Identify Hidden rows in xls - ayonsarkar - Apr-10-2018

any suggestions?


RE: Identify Hidden rows in xls - ljmetzger - Apr-10-2018

Try the following code which works for .xls files (Excel 2003 and earlier only). The code can get the values from .xlsx files (Excel 2007 and later), but xlrd DOES NOT support obtaining the HIDDEN attribute for .xlsx files. That must be done using a different package such as openpyxl.
#Reference: xlrd manual: https://media.readthedocs.org/pdf/xlrd/latest/xlrd.pdf
#Python Forum Reference: https://python-forum.io/Thread-Identify-Hidden-rows-in-xls
import xlrd

print("Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xls file from 'Sheet2'.")
print()

######################################################
# Access .xls file (Excel 2003 and before)
excel_filename = "HiddenRow3OnSheet2.xls"

# Open the workbook
#NOTE: Traceback error if 'formatting_info=True' is NOT INCLUDED
xl_workbook = xlrd.open_workbook(excel_filename, formatting_info=True)

#Set the focus on 'Sheet2'
my_sheet_name = "Sheet2"
xl_sheet = xl_workbook.sheet_by_name(my_sheet_name)

print("File: {}".format(excel_filename))
for irow in range(xl_sheet.nrows):
    ihidden = xl_sheet.rowinfo_map[irow].hidden   #Row Visibility 0=Visible 1=Hidden
    if ihidden == True:
      shidden = "VISIBLE"
    else:
      shidden = "HIDDEN"
    #
    svalue = xl_sheet.cell(irow,0).value
    print("Value: {}        Row Visibility: {}".format(svalue, shidden))
    

######################################################
# Access .xlsx file (Excel 2007 and later)
excel_filename = "HiddenRow3OnSheet2.xlsx"

# Open the workbook
#NOTE: 'formatting_info=True' is NOT SUPPORTED for .xlsx files
xl_workbook = xlrd.open_workbook(excel_filename)

#Set the focus on 'Sheet2'
my_sheet_name = "Sheet2"
xl_sheet = xl_workbook.sheet_by_name(my_sheet_name)

print()
print("File: {}".format(excel_filename))
for irow in range(xl_sheet.nrows):
    svalue = xl_sheet.cell(irow,0).value
    print("Value: {}       Row Visibility: {}".format(svalue, "Not Available for .xlsx files"))
Output:
Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xls file from 'Sheet2'. File: HiddenRow3OnSheet2.xls Value: File 'HiddenRow3OnSheet2.xls' cell A1 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A2 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A3 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xls' cell A4 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A5 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A6 Row Visibility: HIDDEN File: HiddenRow3OnSheet2.xlsx Value: File 'HiddenRow3OnSheet2.xlsx' cell A1 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A2 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A3 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A4 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A5 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A6 Row Visibility: Not Available for .xlsx files
The following code works for .xlsx files (Excel 2007 and later) using package openpyxl. openpyxl DOES NOT support access to .xls files (Excel 2003 and earlier).
openpyxl download: https://pypi.python.org/pypi/openpyxl
openpyxl documentation: https://openpyxl.readthedocs.io/en/stable/
#Reference: openpyxl documentation: https://openpyxl.readthedocs.io/en/stable/
#openpyxl tutorial Reference:       https://automatetheboringstuff.com/chapter12/
#Python Forum Reference:            https://python-forum.io/Thread-Identify-Hidden-rows-in-xls
import openpyxl

print("Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xlsx file from 'Sheet2'.")
print()

######################################################
# Access .xls file (Excel 2003 and before)
# NOTE: openpyxl DOES NOT support .xls files
excel_filename = "HiddenRow3OnSheet2.xls"
print("File: {} - .xls format is NOT SUPPORTED by 'openpyxl'.".format(excel_filename))
print()


######################################################
# Access .xlsx file (Excel 2007 and later)
excel_filename = "HiddenRow3OnSheet2.xlsx"

# Open the workbook
#NOTE: 'data_only=True' REQUIRED to get VALUES from cells that contain formulas
xl_workbook = openpyxl.load_workbook(excel_filename, data_only=True)

#Set the focus on 'Sheet2'
my_sheet_name = "Sheet2"
xl_sheet = xl_workbook[my_sheet_name]

print("File: {}".format(excel_filename))
for irow in range(1, xl_sheet.max_row + 1):
    ihidden = xl_sheet.row_dimensions[irow].hidden  #Row Visibility True / False
    if ihidden == True:
      shidden = "HIDDEN"
    else:
      shidden = "VISIBLE" 
    #
    svalue = xl_sheet.cell(irow, 1).value
    print("Value: {}        Row Visibility: {}".format(svalue, shidden))
Output:
Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xlsx file from 'Sheet2'. File: HiddenRow3OnSheet2.xls - .xls format is NOT SUPPORTED by 'openpyxl'. File: HiddenRow3OnSheet2.xlsx Value: File 'HiddenRow3OnSheet2.xlsx' cell A1 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A2 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A3 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xlsx' cell A4 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A5 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A6 Row Visibility: VISIBLE
See the attached zip file that contains:
a. Excel-003-xlrd.py - Python xlrd script
b. Excel-003-openpyxl.py - Python openpyxl script
c. HiddenRow3OnSheet2.xls - Excel 2003 and earlier format data file
d. HiddenRow3OnSheet2.xlsx - Excel 2007 and later format data file

Lewis