Python Forum
Deleting rows based on cell value in Excel
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Deleting rows based on cell value in Excel
#1
The following snippet of code deletes empty rows. How can I change the code such that a row is deleted if a value of 'x' appears in any row in column C?

for row in ws.iter_rows():
    if not any([cell.value for cell in row[1:]]):
        ws.delete_rows(row[0].row)
Reply
#2
This Excel file contains some names containing X or x in column 2 and some None values in column 2 or 3

Before the Excel looks like this:

Output:
>>> for row in sF[sFsheet].values: print(row) ('id', 'name', 'Age') (1, 'Peter', 1) (2, 'Xavier', 2) (3, 'George', 3) (4, None, None) (5, 'Peggy-Sue', 4) (6, 'Xander', 5) (7, 'Legolas', 6) (8, 'Happy', 7) (9, None, None) (10, 'Fatty', 8) (11, 'Sox', 9) (12, 'Justin', 10)
This deletes rows containing a name in which the letters X or x appear

def myApp():
    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'

    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row

    # open a new XL
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
    newXL = openpyxl.Workbook()
    sheets = newXL.sheetnames
    sheet = sheets[0]

    # before
    for row in sF[sFsheet].values:
        print(row)

    # you can choose the rows and columns you want to iterate
    # for row in sF[sFsheet].iter_rows(min_row=1, min_col=1, max_row=6, max_col=2):
    # look through column 2
    # if a name contains X or x delete that row
    # row[1] is the value in column 2 of the XL
    # if there is an X or x in a header name, the headers row will be gone! Fix that!
    for row in sF[sFsheet].iter_rows():
        if row[1].value == None:
            continue
        if 'X' in row[1].value or 'x' in row[1].value:
            sF[sFsheet].delete_rows(row[0].row)
        
    # after
    for row in sF[sFsheet].values:
        print(row)

    sF.save(path2XL + savename)
Reply
#3
Fantastic, many thanks.
Reply
#4
I find, in Python, there is almost always more than 1 way to do things.

This creates a new modified file, where names with X or x or None values are not copied.

def myApp():
    import openpyxl

    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'

    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row

    # open a new XL
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
    newXL = openpyxl.Workbook()
    sheets = newXL.sheetnames
    sheet = sheets[0]

    # copy straight from sF, except if you find X or x in column 2
    count = 1
    for row in sF[sFsheet].values:
        print(row)
        if count == 1:
            print('Inserting the headers ... ')
            for i in range(len(row)):
                newXL[sheet].cell(row=count, column=i+1).value=row[i]
            count +=1
            continue
        if row[1] == None:
            print('Found a None value ... ')
            continue
        elif 'X' in row[1] or 'x' in row[1]:
            print('Found an X or x in column 2 ... ')
            continue
        else:
            print('row is', row) 
            for i in range(len(row)):
                print('row[i] is', row[i])
                newXL[sheet].cell(row=count, column=i+1).value=row[i]
            count +=1

    newXL.save(path2XL + savename)
Reply
#5
I hadn't got a chance until now to run the code. I get an error: -

    if 'X' in row[1].value or 'x' in row[1].value:
TypeError: argument of type 'int' is not iterable
Reply
#6
Funny, it works for me.

In post #2 I should have saved sF, not newXL, or that will be empty. Actually, you don't need newXL at all in post #2

Below is the output in Idle for post #4
Output:
>>> myApp() ('id', 'name', 'Age') Inserting the headers ... (1, 'Peter', 1) row is (1, 'Peter', 1) row[i] is 1 row[i] is Peter row[i] is 1 (2, 'Xavier', 2) Found an X or x in column 2 ... (3, 'George', 3) row is (3, 'George', 3) row[i] is 3 row[i] is George row[i] is 3 (4, None, None) Found a None value ... (5, 'Peggy-Sue', 4) row is (5, 'Peggy-Sue', 4) row[i] is 5 row[i] is Peggy-Sue row[i] is 4 (6, 'Xander', 5) Found an X or x in column 2 ... (7, 'Legolas', 6) row is (7, 'Legolas', 6) row[i] is 7 row[i] is Legolas row[i] is 6 (8, 'Happy', 7) row is (8, 'Happy', 7) row[i] is 8 row[i] is Happy row[i] is 7 (9, None, None) Found a None value ... (10, 'Fatty', 8) row is (10, 'Fatty', 8) row[i] is 10 row[i] is Fatty row[i] is 8 (11, 'Sox', 9) Found an X or x in column 2 ... (12, 'Justin', 10) row is (12, 'Justin', 10) row[i] is 12 row[i] is Justin row[i] is 10
I can't even get Pycharm to work, or I'd try it there!

I'm sure you'll find the problem!
Reply
#7
OK thanks I will need to play with it and see why it's not working.
Reply
#8
I changed (based on post #2)

if 'X' in row[1].value or 'x' in row[1].value:
to

if row[1].value == 'X' or row[1].value=='x':
and it works now.

Referencing post #4, I get an error on line (which I have changed, in line with I have said above)

elif row[1].value == 'x' or row[1].value=='x':
I get the error

AttributeError: 'str' object has no attribute 'value'
What am I doing wrong?
Reply
#9
I don't know, sorry!

I copied and pasted Post #4 straight into Idle and it worked exactly as I hoped!

I didn't change anything.

Change it back and try again!

This is post #2 changed, because I didn't need a new Excel for post #2

def myApp():
    import openpyxl
    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
 
    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row
 
    # before
    for row in sF[sFsheet].values:
        print(row)
 
    # you can choose the rows and columns you want to iterate
    # for row in sF[sFsheet].iter_rows(min_row=1, min_col=1, max_row=6, max_col=2):
    # look through column 2
    # if a name contains X or x delete that row
    # row[1] is the value in column 2 of the XL
    # if there is an X or x in a header name, the headers row will be gone! Fix that!
    for row in sF[sFsheet].iter_rows():
        if row[1].value == None:
            continue
        if 'X' in row[1].value or 'x' in row[1].value:
            sF[sFsheet].delete_rows(row[0].row)
         
    # after
    for row in sF[sFsheet].values:
        print(row)
 
    sF.save(path2XL + savename)
From this, you can see how .delete_rows works: it adds and empty row, then deletes, otherwise you would get and index error,
Reply
#10
It may be that I am using PyCharm environment that is causing the issue. Sigh.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Copy Paste excel files based on the first letters of the file name Viento 2 449 Feb-07-2024, 12:24 PM
Last Post: Viento
  Color a table cell based on specific text Creepy 11 2,008 Jul-27-2023, 02:48 PM
Last Post: deanhystad
  How to properly format rows and columns in excel data from parsed .txt blocks jh67 7 1,887 Dec-12-2022, 08:22 PM
Last Post: jh67
  How to assign a value to pandas dataframe column rows based on a condition klllmmm 0 846 Sep-08-2022, 06:32 AM
Last Post: klllmmm
  The code I have written removes the desired number of rows, but wrong rows Jdesi1983 0 1,636 Dec-08-2021, 04:42 AM
Last Post: Jdesi1983
  Calculate next rows based on previous values of array divon 0 1,787 Nov-23-2021, 04:44 AM
Last Post: divon
  Using Excel Cell As A Variable In A Loop knight2000 7 4,135 Aug-25-2021, 12:43 PM
Last Post: snippsat
  Using Excel Cell As A Variable In A Loop knight2000 7 5,045 Jul-18-2021, 10:52 AM
Last Post: knight2000
  Pandas DataFrame combine rows by column value, where Date Rows are NULL rhat398 0 2,122 May-04-2021, 10:51 PM
Last Post: rhat398
Smile Set 'Time' format cell when writing data to excel and not 'custom' limors 3 6,318 Mar-29-2021, 09:36 PM
Last Post: Larz60+

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020