Python Forum

Pages: 1 2

The following snippet of code deletes empty rows. How can I change the code such that a row is deleted if a value of 'x' appears in any row in column C?

for row in ws.iter_rows():
    if not any([cell.value for cell in row[1:]]):
        ws.delete_rows(row[0].row)

This Excel file contains some names containing X or x in column 2 and some None values in column 2 or 3

Before the Excel looks like this:

Output:>>> for row in sF[sFsheet].values:
    print(row)

    
('id', 'name', 'Age')
(1, 'Peter', 1)
(2, 'Xavier', 2)
(3, 'George', 3)
(4, None, None)
(5, 'Peggy-Sue', 4)
(6, 'Xander', 5)
(7, 'Legolas', 6)
(8, 'Happy', 7)
(9, None, None)
(10, 'Fatty', 8)
(11, 'Sox', 9)
(12, 'Justin', 10)

This deletes rows containing a name in which the letters X or x appear

def myApp():
    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'

    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row

    # open a new XL
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
    newXL = openpyxl.Workbook()
    sheets = newXL.sheetnames
    sheet = sheets[0]

    # before
    for row in sF[sFsheet].values:
        print(row)

    # you can choose the rows and columns you want to iterate
    # for row in sF[sFsheet].iter_rows(min_row=1, min_col=1, max_row=6, max_col=2):
    # look through column 2
    # if a name contains X or x delete that row
    # row[1] is the value in column 2 of the XL
    # if there is an X or x in a header name, the headers row will be gone! Fix that!
    for row in sF[sFsheet].iter_rows():
        if row[1].value == None:
            continue
        if 'X' in row[1].value or 'x' in row[1].value:
            sF[sFsheet].delete_rows(row[0].row)
        
    # after
    for row in sF[sFsheet].values:
        print(row)

    sF.save(path2XL + savename)

Fantastic, many thanks.

I find, in Python, there is almost always more than 1 way to do things.

This creates a new modified file, where names with X or x or None values are not copied.

def myApp():
    import openpyxl

    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'

    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row

    # open a new XL
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
    newXL = openpyxl.Workbook()
    sheets = newXL.sheetnames
    sheet = sheets[0]

    # copy straight from sF, except if you find X or x in column 2
    count = 1
    for row in sF[sFsheet].values:
        print(row)
        if count == 1:
            print('Inserting the headers ... ')
            for i in range(len(row)):
                newXL[sheet].cell(row=count, column=i+1).value=row[i]
            count +=1
            continue
        if row[1] == None:
            print('Found a None value ... ')
            continue
        elif 'X' in row[1] or 'x' in row[1]:
            print('Found an X or x in column 2 ... ')
            continue
        else:
            print('row is', row) 
            for i in range(len(row)):
                print('row[i] is', row[i])
                newXL[sheet].cell(row=count, column=i+1).value=row[i]
            count +=1

    newXL.save(path2XL + savename)

I hadn't got a chance until now to run the code. I get an error: -

    if 'X' in row[1].value or 'x' in row[1].value:
TypeError: argument of type 'int' is not iterable

Funny, it works for me.

In post #2 I should have saved sF, not newXL, or that will be empty. Actually, you don't need newXL at all in post #2

Below is the output in Idle for post #4

Output:>>> myApp()
('id', 'name', 'Age')
Inserting the headers ... 
(1, 'Peter', 1)
row is (1, 'Peter', 1)
row[i] is 1
row[i] is Peter
row[i] is 1
(2, 'Xavier', 2)
Found an X or x in column 2 ... 
(3, 'George', 3)
row is (3, 'George', 3)
row[i] is 3
row[i] is George
row[i] is 3
(4, None, None)
Found a None value ... 
(5, 'Peggy-Sue', 4)
row is (5, 'Peggy-Sue', 4)
row[i] is 5
row[i] is Peggy-Sue
row[i] is 4
(6, 'Xander', 5)
Found an X or x in column 2 ... 
(7, 'Legolas', 6)
row is (7, 'Legolas', 6)
row[i] is 7
row[i] is Legolas
row[i] is 6
(8, 'Happy', 7)
row is (8, 'Happy', 7)
row[i] is 8
row[i] is Happy
row[i] is 7
(9, None, None)
Found a None value ... 
(10, 'Fatty', 8)
row is (10, 'Fatty', 8)
row[i] is 10
row[i] is Fatty
row[i] is 8
(11, 'Sox', 9)
Found an X or x in column 2 ... 
(12, 'Justin', 10)
row is (12, 'Justin', 10)
row[i] is 12
row[i] is Justin
row[i] is 10

I can't even get Pycharm to work, or I'd try it there!

I'm sure you'll find the problem!

OK thanks I will need to play with it and see why it's not working.

I changed (based on post #2)

if 'X' in row[1].value or 'x' in row[1].value:

if row[1].value == 'X' or row[1].value=='x':

and it works now.

Referencing post #4, I get an error on line (which I have changed, in line with I have said above)

elif row[1].value == 'x' or row[1].value=='x':

I get the error

AttributeError: 'str' object has no attribute 'value'

What am I doing wrong?

I don't know, sorry!

I copied and pasted Post #4 straight into Idle and it worked exactly as I hoped!

I didn't change anything.

Change it back and try again!

This is post #2 changed, because I didn't need a new Excel for post #2

def myApp():
    import openpyxl
    path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/'
    file = 'some_X_or_x_in_column_2.xlsx'
    name = file.split('.')
    savename = name[0] + '_modified.xlsx'
 
    sF = openpyxl.load_workbook(path2XL + file)
    sFsheets = sF.sheetnames
    sFsheet = sFsheets[0]
    sFmaxRow = sF[sFsheet].max_row
 
    # before
    for row in sF[sFsheet].values:
        print(row)
 
    # you can choose the rows and columns you want to iterate
    # for row in sF[sFsheet].iter_rows(min_row=1, min_col=1, max_row=6, max_col=2):
    # look through column 2
    # if a name contains X or x delete that row
    # row[1] is the value in column 2 of the XL
    # if there is an X or x in a header name, the headers row will be gone! Fix that!
    for row in sF[sFsheet].iter_rows():
        if row[1].value == None:
            continue
        if 'X' in row[1].value or 'x' in row[1].value:
            sF[sFsheet].delete_rows(row[0].row)
         
    # after
    for row in sF[sFsheet].values:
        print(row)
 
    sF.save(path2XL + savename)

From this, you can see how .delete_rows works: it adds and empty row, then deletes, otherwise you would get and index error,

It may be that I am using PyCharm environment that is causing the issue. Sigh.

Pages: 1 2

azizrasul

Pedroski55

azizrasul

Pedroski55

azizrasul

Pedroski55

azizrasul

azizrasul

Pedroski55

azizrasul