Bottom Page

Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
 Identify Hidden rows in xls
#1
Hi,

I have a xls file which is large and I am using xlrd to read the file. However there are hidden rows in the xls which i donot want to read.

I am using open_workbook to open the file, and getting the sheet object by workbook.sheet_by_name.

How can i check if a certain row is hidden? Can you please suggest? From the forums I see people are suggesting to use xlrd.sheet.RowInfo, but how i do set the row info from the sheet object?

It says, rowinfo is Derived from ROW records. How can i get Row records? Any help will be appreicated?
Quote
#2
any suggestions?
Quote
#3
Try the following code which works for .xls files (Excel 2003 and earlier only). The code can get the values from .xlsx files (Excel 2007 and later), but xlrd DOES NOT support obtaining the HIDDEN attribute for .xlsx files. That must be done using a different package such as openpyxl.
#Reference: xlrd manual: https://media.readthedocs.org/pdf/xlrd/latest/xlrd.pdf
#Python Forum Reference: https://python-forum.io/Thread-Identify-Hidden-rows-in-xls
import xlrd

print("Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xls file from 'Sheet2'.")
print()

######################################################
# Access .xls file (Excel 2003 and before)
excel_filename = "HiddenRow3OnSheet2.xls"

# Open the workbook
#NOTE: Traceback error if 'formatting_info=True' is NOT INCLUDED
xl_workbook = xlrd.open_workbook(excel_filename, formatting_info=True)

#Set the focus on 'Sheet2'
my_sheet_name = "Sheet2"
xl_sheet = xl_workbook.sheet_by_name(my_sheet_name)

print("File: {}".format(excel_filename))
for irow in range(xl_sheet.nrows):
    ihidden = xl_sheet.rowinfo_map[irow].hidden   #Row Visibility 0=Visible 1=Hidden
    if ihidden == True:
      shidden = "VISIBLE"
    else:
      shidden = "HIDDEN"
    #
    svalue = xl_sheet.cell(irow,0).value
    print("Value: {}        Row Visibility: {}".format(svalue, shidden))
    

######################################################
# Access .xlsx file (Excel 2007 and later)
excel_filename = "HiddenRow3OnSheet2.xlsx"

# Open the workbook
#NOTE: 'formatting_info=True' is NOT SUPPORTED for .xlsx files
xl_workbook = xlrd.open_workbook(excel_filename)

#Set the focus on 'Sheet2'
my_sheet_name = "Sheet2"
xl_sheet = xl_workbook.sheet_by_name(my_sheet_name)

print()
print("File: {}".format(excel_filename))
for irow in range(xl_sheet.nrows):
    svalue = xl_sheet.cell(irow,0).value
    print("Value: {}       Row Visibility: {}".format(svalue, "Not Available for .xlsx files"))
Output:
Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xls file from 'Sheet2'. File: HiddenRow3OnSheet2.xls Value: File 'HiddenRow3OnSheet2.xls' cell A1 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A2 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A3 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xls' cell A4 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A5 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xls' cell A6 Row Visibility: HIDDEN File: HiddenRow3OnSheet2.xlsx Value: File 'HiddenRow3OnSheet2.xlsx' cell A1 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A2 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A3 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A4 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A5 Row Visibility: Not Available for .xlsx files Value: File 'HiddenRow3OnSheet2.xlsx' cell A6 Row Visibility: Not Available for .xlsx files
The following code works for .xlsx files (Excel 2007 and later) using package openpyxl. openpyxl DOES NOT support access to .xls files (Excel 2003 and earlier).
openpyxl download: https://pypi.python.org/pypi/openpyxl
openpyxl documentation: https://openpyxl.readthedocs.io/en/stable/
#Reference: openpyxl documentation: https://openpyxl.readthedocs.io/en/stable/
#openpyxl tutorial Reference:       https://automatetheboringstuff.com/chapter12/
#Python Forum Reference:            https://python-forum.io/Thread-Identify-Hidden-rows-in-xls
import openpyxl

print("Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xlsx file from 'Sheet2'.")
print()

######################################################
# Access .xls file (Excel 2003 and before)
# NOTE: openpyxl DOES NOT support .xls files
excel_filename = "HiddenRow3OnSheet2.xls"
print("File: {} - .xls format is NOT SUPPORTED by 'openpyxl'.".format(excel_filename))
print()


######################################################
# Access .xlsx file (Excel 2007 and later)
excel_filename = "HiddenRow3OnSheet2.xlsx"

# Open the workbook
#NOTE: 'data_only=True' REQUIRED to get VALUES from cells that contain formulas
xl_workbook = openpyxl.load_workbook(excel_filename, data_only=True)

#Set the focus on 'Sheet2'
my_sheet_name = "Sheet2"
xl_sheet = xl_workbook[my_sheet_name]

print("File: {}".format(excel_filename))
for irow in range(1, xl_sheet.max_row + 1):
    ihidden = xl_sheet.row_dimensions[irow].hidden  #Row Visibility True / False
    if ihidden == True:
      shidden = "HIDDEN"
    else:
      shidden = "VISIBLE" 
    #
    svalue = xl_sheet.cell(irow, 1).value
    print("Value: {}        Row Visibility: {}".format(svalue, shidden))
Output:
Read the VALUE and ROW VISIBILITY from cells A1:A6 in a .xlsx file from 'Sheet2'. File: HiddenRow3OnSheet2.xls - .xls format is NOT SUPPORTED by 'openpyxl'. File: HiddenRow3OnSheet2.xlsx Value: File 'HiddenRow3OnSheet2.xlsx' cell A1 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A2 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A3 Row Visibility: HIDDEN Value: File 'HiddenRow3OnSheet2.xlsx' cell A4 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A5 Row Visibility: VISIBLE Value: File 'HiddenRow3OnSheet2.xlsx' cell A6 Row Visibility: VISIBLE
See the attached zip file that contains:
a. Excel-003-xlrd.py - Python xlrd script
b. Excel-003-openpyxl.py - Python openpyxl script
c. HiddenRow3OnSheet2.xls - Excel 2003 and earlier format data file
d. HiddenRow3OnSheet2.xlsx - Excel 2007 and later format data file

Lewis


Attached Files
.zip   HiddenRow3OnExcelFileSheet2.zip (Size: 16.58 KB / Downloads: 47)
To paraphrase: 'Throw out your dead' code. https://www.youtube.com/watch?v=grbSQ6O6kbs Forward to 1:00
Quote

Top Page

Possibly Related Threads...
Thread Author Replies Views Last Post
  Need help to identify Mersenne Primes, I do need a search pattern. Pleiades 0 89 Dec-03-2019, 11:05 PM
Last Post: Pleiades
  Substitution with regular expression returns hidden character SOH bajacri 2 129 Nov-17-2019, 03:38 AM
Last Post: bajacri
  Syntax Error : I can't identify what's wrong! caarsonr 11 653 Jun-10-2019, 11:18 PM
Last Post: Yoriz
  Identify two specific words next to each stahorse 9 497 Apr-26-2019, 09:59 AM
Last Post: perfringo
  A hidden library? BillMcEnaney 3 449 Apr-02-2019, 12:00 AM
Last Post: BillMcEnaney
  find hidden python files bobsmith76 4 802 Oct-23-2018, 10:57 PM
Last Post: wavic
  PyMODINIT_FUNC and gcc hidden visibility jock 0 859 Feb-01-2018, 03:47 PM
Last Post: jock
  How dont' show hidden system files Mike Ru 5 2,643 Jan-11-2018, 01:18 PM
Last Post: buran
  Please, help to identify the mistake in the code Alberto 4 1,620 Jun-22-2017, 09:52 AM
Last Post: Alberto
  How to get pyhton to identify symbols in a string MemeLord15 4 1,526 Apr-05-2017, 11:14 AM
Last Post: wavic

Forum Jump:


Users browsing this thread: 1 Guest(s)