Python Forum

Full Version: Deleting rows based on cell value in Excel
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Pages: 1 2
The following snippet of code deletes empty rows. How can I change the code such that a row is deleted if a value of 'x' appears in any row in column C?

for row in ws.iter_rows():
    if not any([cell.value for cell in row[1:]]):
        ws.delete_rows(row[0].row)
This Excel file contains some names containing X or x in column 2 and some None values in column 2 or 3

Before the Excel looks like this:

Output:
>>> for row in sF[sFsheet].values: print(row) ('id', 'name', 'Age') (1, 'Peter', 1) (2, 'Xavier', 2) (3, 'George', 3) (4, None, None) (5, 'Peggy-Sue', 4) (6, 'Xander', 5) (7, 'Legolas', 6) (8, 'Happy', 7) (9, None, None) (10, 'Fatty', 8) (11, 'Sox', 9) (12, 'Justin', 10)
This deletes rows containing a name in which the letters X or x appear

def myApp():
    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'

    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row

    # open a new XL
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
    newXL = openpyxl.Workbook()
    sheets = newXL.sheetnames
    sheet = sheets[0]

    # before
    for row in sF[sFsheet].values:
        print(row)

    # you can choose the rows and columns you want to iterate
    # for row in sF[sFsheet].iter_rows(min_row=1, min_col=1, max_row=6, max_col=2):
    # look through column 2
    # if a name contains X or x delete that row
    # row[1] is the value in column 2 of the XL
    # if there is an X or x in a header name, the headers row will be gone! Fix that!
    for row in sF[sFsheet].iter_rows():
        if row[1].value == None:
            continue
        if 'X' in row[1].value or 'x' in row[1].value:
            sF[sFsheet].delete_rows(row[0].row)
        
    # after
    for row in sF[sFsheet].values:
        print(row)

    sF.save(path2XL + savename)
Fantastic, many thanks.
I find, in Python, there is almost always more than 1 way to do things.

This creates a new modified file, where names with X or x or None values are not copied.

def myApp():
    import openpyxl

    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'

    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row

    # open a new XL
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
    newXL = openpyxl.Workbook()
    sheets = newXL.sheetnames
    sheet = sheets[0]

    # copy straight from sF, except if you find X or x in column 2
    count = 1
    for row in sF[sFsheet].values:
        print(row)
        if count == 1:
            print('Inserting the headers ... ')
            for i in range(len(row)):
                newXL[sheet].cell(row=count, column=i+1).value=row[i]
            count +=1
            continue
        if row[1] == None:
            print('Found a None value ... ')
            continue
        elif 'X' in row[1] or 'x' in row[1]:
            print('Found an X or x in column 2 ... ')
            continue
        else:
            print('row is', row) 
            for i in range(len(row)):
                print('row[i] is', row[i])
                newXL[sheet].cell(row=count, column=i+1).value=row[i]
            count +=1

    newXL.save(path2XL + savename)
I hadn't got a chance until now to run the code. I get an error: -

    if 'X' in row[1].value or 'x' in row[1].value:
TypeError: argument of type 'int' is not iterable
Funny, it works for me.

In post #2 I should have saved sF, not newXL, or that will be empty. Actually, you don't need newXL at all in post #2

Below is the output in Idle for post #4
Output:
>>> myApp() ('id', 'name', 'Age') Inserting the headers ... (1, 'Peter', 1) row is (1, 'Peter', 1) row[i] is 1 row[i] is Peter row[i] is 1 (2, 'Xavier', 2) Found an X or x in column 2 ... (3, 'George', 3) row is (3, 'George', 3) row[i] is 3 row[i] is George row[i] is 3 (4, None, None) Found a None value ... (5, 'Peggy-Sue', 4) row is (5, 'Peggy-Sue', 4) row[i] is 5 row[i] is Peggy-Sue row[i] is 4 (6, 'Xander', 5) Found an X or x in column 2 ... (7, 'Legolas', 6) row is (7, 'Legolas', 6) row[i] is 7 row[i] is Legolas row[i] is 6 (8, 'Happy', 7) row is (8, 'Happy', 7) row[i] is 8 row[i] is Happy row[i] is 7 (9, None, None) Found a None value ... (10, 'Fatty', 8) row is (10, 'Fatty', 8) row[i] is 10 row[i] is Fatty row[i] is 8 (11, 'Sox', 9) Found an X or x in column 2 ... (12, 'Justin', 10) row is (12, 'Justin', 10) row[i] is 12 row[i] is Justin row[i] is 10
I can't even get Pycharm to work, or I'd try it there!

I'm sure you'll find the problem!
OK thanks I will need to play with it and see why it's not working.
I changed (based on post #2)

if 'X' in row[1].value or 'x' in row[1].value:
to

if row[1].value == 'X' or row[1].value=='x':
and it works now.

Referencing post #4, I get an error on line (which I have changed, in line with I have said above)

elif row[1].value == 'x' or row[1].value=='x':
I get the error

AttributeError: 'str' object has no attribute 'value'
What am I doing wrong?
I don't know, sorry!

I copied and pasted Post #4 straight into Idle and it worked exactly as I hoped!

I didn't change anything.

Change it back and try again!

This is post #2 changed, because I didn't need a new Excel for post #2

def myApp():
    import openpyxl
    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
 
    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row
 
    # before
    for row in sF[sFsheet].values:
        print(row)
 
    # you can choose the rows and columns you want to iterate
    # for row in sF[sFsheet].iter_rows(min_row=1, min_col=1, max_row=6, max_col=2):
    # look through column 2
    # if a name contains X or x delete that row
    # row[1] is the value in column 2 of the XL
    # if there is an X or x in a header name, the headers row will be gone! Fix that!
    for row in sF[sFsheet].iter_rows():
        if row[1].value == None:
            continue
        if 'X' in row[1].value or 'x' in row[1].value:
            sF[sFsheet].delete_rows(row[0].row)
         
    # after
    for row in sF[sFsheet].values:
        print(row)
 
    sF.save(path2XL + savename)
From this, you can see how .delete_rows works: it adds and empty row, then deletes, otherwise you would get and index error,
It may be that I am using PyCharm environment that is causing the issue. Sigh.
Pages: 1 2