Python Forum
deleting certain rows from multidimensional list
#1


Hello,

I am a semi-beginner with Python 3.x.
I loaded a CSV file (here is an example if someone wants to try) that contains coordinates, and I converted the values from strings to floats. Up to this point it is simple.

data = list()

import csv
with open('data.csv', newline='') as csvfile:
    file = csv.reader(csvfile, delimiter=';', quotechar='"')
    for row in file:
        data.append(row[1:3])
        
    del data[0]#delete header
    
    for inner in data:
        for index, string in enumerate(inner):
            inner[index] = float(string)
            

    #print (data)
Then I would like to delete the rows that contain a value below a certain threshold in one of the columns. How could I do it? I am having a lot of difficulty with multidimensional arrays.

I tried for loops, but I could not get them to work:
for row in data:
    print("row: ", row)
    for element in row:
        print("ele: ", element)
Reply
#2
Your sample data:

csv_string='''type;latitude;longitude;speed
T;37.547575;15.143184;0.963040
T;37.547569;15.143185;0.833400
T;37.547565;15.143186;0.351880
T;37.547561;15.143194;0.629680
T;37.547560;15.143205;1.129720
T;37.547561;15.143222;1.259360
T;37.547562;15.143236;1.129720'''
The order of steps in your program:
  1. Read a line
  2. Convert data of the line
  3. Make decision if data is in boundaries
  4. If data is in boundaries -> yield data


This is just an example; don't use this code as-is:
import csv
from functools import partial
import io


def in_boundaries(lat, lon, lat_min, lat_max, lon_min, lon_max):
    return lat_min < lat < lat_max and lon_min < lon < lon_max 

# your boundaries
boundaries = {
    'lat_min': 37.547560,
    'lat_max': 37.547570,
    'lon_min': 15.143184,
    'lon_max': 15.143186,
    }

# functional approach, using keyword unpacking
my_boundaries = partial(in_boundaries, **boundaries)


csv_string = '''type;latitude;longitude;speed
T;37.547575;15.143184;0.963040
T;37.547569;15.143185;0.833400
T;37.547565;15.143186;0.351880
T;37.547561;15.143194;0.629680
T;37.547560;15.143205;1.129720
T;37.547561;15.143222;1.259360
T;37.547562;15.143236;1.129720'''

file_like_csv = io.StringIO(csv_string)

reader = csv.reader(file_like_csv, delimiter=';')
header = next(reader)
#print(header)

for row in reader:
    # the part where you split your row into columns
    try:
        # row_type instead of type, to avoid shadowing the built-in
        row_type, latitude, longitude, speed = row
    except ValueError:
        print('Invalid column length:', row)
        continue
    try:
        # convert lat, lon to float
        latitude, longitude = map(float, [latitude, longitude])
    except ValueError:
        print('Wrong value in:', row)
        continue
    # now decide if your data is in your defined boundary
    if my_boundaries(latitude, longitude):
        print(row)
        # here you should do something with your valid data
    else:
        # you don't need this.
        # this is just for control
        # print('Not in boundary:', row)
        pass
In the end it's very easy: write a function that decides whether your data is within your boundaries, then do something with it.
There is no need for this strange nested iteration.

What you need to know:
  • Exception handling
  • argument unpacking
  • keyword argument unpacking, if you're working with dicts. Not necessary.
  • functions
  • use of csv module
  • knowledge of iterables, csv.reader is an iterator
  • comparison with more than one value (min_val < val < max_val)
  • logical operations: and, or, not
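
Several of these points can be combined into a compact filter. The following is only a minimal sketch, assuming the rows are already converted to floats as in the first post; keep_row and the threshold values in limits are hypothetical names and numbers chosen for illustration.

```python
# Rows as [latitude, longitude] floats, taken from the sample data above.
data = [
    [37.547575, 15.143184],
    [37.547569, 15.143185],
    [37.547565, 15.143186],
    [37.547561, 15.143194],
]


def keep_row(row, min_lat, max_lat):
    # Argument unpacking splits the row into named columns.
    lat, lon = row
    # Chained comparison: equivalent to (min_lat < lat) and (lat < max_lat).
    return min_lat < lat < max_lat


# Hypothetical thresholds, passed with keyword argument unpacking.
limits = {"min_lat": 37.547562, "max_lat": 37.547570}

# Build a new list instead of deleting from the old one.
filtered = [row for row in data if keep_row(row, **limits)]
print(filtered)
```

Building a new list avoids the problems that come with deleting items from a list while iterating over it.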
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
Reply
#3
This might help (it deletes a row or column if its total is less than a given amount):
import random

# populate a 2D grid, 10 rows tall and 15 columns wide, with random numbers 0-9
grid = []
for y in range(10):
    row = []
    for x in range(15):
        row.append(random.randint(0, 9))
    grid.append(row)

# display grid
for row in grid:
    print(row)

# delete rows that sum to < 60 (iterate backwards because deleting while
# iterating forwards skips the entry after each deleted one)
for y in range(len(grid) - 1, -1, -1):
    total = sum(grid[y])
    print(total)
    if total < 60:
        del grid[y]

# display grid
for row in grid:
    print(row)

# delete columns that sum to < 40
for x in range(15 - 1, -1, -1):
    total = 0
    for y in range(len(grid)):
        total += grid[y][x]
    print(total)
    if total < 40:
        for y in range(len(grid)):
            del grid[y][x]

# display grid
for row in grid:
    print(row)
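
An alternative to deleting in place is to build new lists with comprehensions, which removes the need to iterate backwards. This is only a sketch on a small fixed grid rather than random data, and the thresholds 10 and 12 are assumptions picked to fit that grid.

```python
grid = [
    [1, 2, 3],
    [9, 9, 9],
    [0, 8, 7],
]

# Keep only the rows whose sum is at least 10 (hypothetical threshold).
rows_kept = [row for row in grid if sum(row) >= 10]

# Keep only the columns whose sum is at least 12 (hypothetical threshold):
# zip(*rows) transposes rows into columns, so the same filter applies.
cols_kept = [col for col in zip(*rows_kept) if sum(col) >= 12]

# Transpose back and turn the tuples into lists again.
result = [list(row) for row in zip(*cols_kept)]
print(result)
```

The original grid is left unchanged, which makes it easier to compare the data before and after filtering.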
Reply
#4
(Nov-02-2017, 04:00 PM)DeaD_EyE Wrote: The order of steps in your program:
  1. Read a line
  2. Convert data of the line
  3. Make decision if data is in boundaries
  4. If data is in boundaries -> yield data


At the end it's very easy. Make a function which decides if your data is in your boundaries and do something with it.
No need to make this strange nested iteration.

What you need to know:
  • Exception handling
  • argument unpacking
  • keyword argument unpacking, if you're working with dicts. Not necessary.
  • functions
  • use of csv module
  • knowledge of iterables, csv.reader is an iterator
  • comparison with more than one value (min_val < val < max_val)
  • logical operations: and, or, not

DeaD_EyE, I really love your coding style, and the simplicity and comments you used to help me understand everything.
I lost an hour trying to get the nested iteration working, while you did it in a simpler and nicer way. I think I am still programming in a C style, and in Python I shouldn't use the approach I am used to.
I really liked how you handled the errors. As a hobbyist I am not used to this, but I know that good programmers and good code handle all the possibilities.

The map function is very powerful. I didn't know about it (well, in C it is something very different), and I will use it in the future for sure.

The only thing I didn't understand well, even after reading the docs, was partial(). I found it so confusing that I decided to remove it and make the code more readable for myself:

lat_min= 37.547550
lat_max= 37.547570
lon_min= 15.143184
lon_max= 15.143196
    
def in_boundaries(lat, lon):
    return lat_min < lat < lat_max and lon_min < lon < lon_max 
  
I know this is not as good as yours, but it works the same and is simpler for me.

Using your code, I just had to append() the good values to a new list, and my new data set is ready to be written to a new CSV file.
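
That last step could look roughly like the sketch below. It assumes the filtered rows are [latitude, longitude] floats; good_rows, the header names, and the use of io.StringIO in place of a real output file are all choices made for illustration.

```python
import csv
import io

# Filtered rows collected with append(), as [latitude, longitude] floats.
good_rows = [
    [37.547569, 15.143185],
    [37.547565, 15.143186],
]

# io.StringIO stands in for a real file here; with a file you would use
# open('filtered.csv', 'w', newline='') instead (the file name is hypothetical).
out = io.StringIO()
writer = csv.writer(out, delimiter=';')
writer.writerow(['latitude', 'longitude'])  # header row
writer.writerows(good_rows)                 # one CSV line per inner list

csv_text = out.getvalue()
print(csv_text)
```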

Again, a big thank you!
Reply
#5
Thanks :-)

partial is a function that takes a function as its first argument. All following positional arguments are applied to that function, and keyword arguments are applied as well. It returns a new partial function, which you can call later with the remaining positional and keyword arguments. The double star in front of a name in a function call is keyword argument unpacking.

A small example:

from functools import partial


def foo(a, b, c, d):
    return a, b, c, d

# b, c and d are fixed now; only a is still missing
f = partial(foo, b=2, c=3, d=4)
print(f(1))
Output:
(1, 2, 3, 4)
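
Applied to the boundary check from this thread, the same idea can be sketched as follows. The boundary values are the ones from post #4, and the name check is a hypothetical choice for this example.

```python
from functools import partial


def in_boundaries(lat, lon, lat_min, lat_max, lon_min, lon_max):
    return lat_min < lat < lat_max and lon_min < lon < lon_max


boundaries = {
    'lat_min': 37.547550,
    'lat_max': 37.547570,
    'lon_min': 15.143184,
    'lon_max': 15.143196,
}

# **boundaries unpacks the dict into keyword arguments, so this is the same as
# partial(in_boundaries, lat_min=37.547550, lat_max=37.547570, ...).
check = partial(in_boundaries, **boundaries)

# Only lat and lon are left to pass in.
inside = check(37.547561, 15.143194)
outside = check(37.547575, 15.143184)
print(inside, outside)
```

Compared with module-level variables, this keeps the limits attached to the function object, so several checks with different boundaries can exist side by side.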
Reply

