Python Forum

Full Version: 'Get closest value array in array of arrays.' follow up help.
I created a post a little while back where I needed a function findClosestArray(x, y) which took two arguments:
'x' - a single array of rgb values [35,76,97]
'y' - a 2d array full of rgb values [[33,76,90], [75, 97, 74], ...]
findClosestArray would take x and find the closest matching array in 'y'. So for the arrays I posted before 'findClosestArray' would return [33,76,90], because that is closest to [35,76,97].
Here is the code for that:
def find_closest(x, array2d):
    x_sum = sum(x)
    return min(array2d, key=lambda z: abs(sum(z) - x_sum))
I have been using this ever since and thought nothing of it, because it seemed to work perfectly. I have pretty much come to the end of my project and am now doing some final, extensive testing.
I didn't notice this before (my set of images was about 50 and is now over 1000), but the image created by my script is actually in black and white.

Take these two arrays:
[1]: [50, 67, 98] - mucky blue
[2]: [98, 67, 50] - browny-orange
RGB set 2 is just the reverse of set 1. With the code I wanted, calling 'findClosestArray' should return a different value for each array, since they are TWO DIFFERENT COLOURS.
However, the way the code works is that it sums up all values and finds the closest matching one. That's where the problem lies:
although RGB set 2 is a different color to set 1, if they are summed up, they become the SAME VALUE. This means when they are compared to the 2d array of rgb values 'y', it will always return the same value even though it is a different colour.
As well as this, summing (or averaging) the RGB values is pretty much how you get the greyscale version of an image, which explains the black and white output.
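To make the collision concrete, here is a minimal sketch that runs the sum-based find_closest from above on the two colours in question. The palette values are made up purely for illustration.

```python
def find_closest(x, array2d):
    x_sum = sum(x)
    return min(array2d, key=lambda z: abs(sum(z) - x_sum))

# Hypothetical palette, chosen only for illustration.
palette = [[40, 60, 110], [110, 60, 40], [200, 200, 200]]

mucky_blue = [50, 67, 98]
browny_orange = [98, 67, 50]

match_blue = find_closest(mucky_blue, palette)
match_orange = find_closest(browny_orange, palette)

# Both inputs sum to 215, so the key function cannot tell them apart
# and both colours get the same palette entry.
print(sum(mucky_blue), sum(browny_orange))
print(match_blue, match_orange)
```

Because only the channel sum is compared, any two colours with the same sum are treated as identical.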

The problem comes entirely from summing the values in each array. That means I need a way to compare 'x' to 'y' without (effectively) doing findClosestArray(sum(x), sum(y)).

How can I do that?
Dream
This is an interesting problem.
Apparently you should compare the R, G and B values separately. I think you should add up the absolute differences between the corresponding channels of the two lists being compared.
I tried to make something for you to do the trick, but it is not as elegant and pythonic as your solution. Perhaps you can beautify this function.
def find_closest(x, array2d):
    least_diff = float("inf")  # larger than any possible difference
    least_diff_index = -1
    for num, elm in enumerate(array2d):
        diff = abs(x[0]-elm[0]) + abs(x[1]-elm[1]) + abs(x[2]-elm[2])
        if diff < least_diff:
            least_diff = diff
            least_diff_index = num
    return array2d[least_diff_index]

x = [35,76,97]
y = [[33,76,90], [75, 97, 74], [100, 102, 200]]
print(find_closest(x, y))

Output:
[33, 76, 90]
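One possible way to tidy this up, as a sketch only: keep the same per-channel absolute difference (Manhattan distance), but let min() with a key function do the bookkeeping. The name find_closest_manhattan is made up here to avoid clashing with the functions in the thread.

```python
def find_closest_manhattan(x, array2d):
    # Sum of absolute per-channel differences between x and a candidate.
    def distance(candidate):
        return sum(abs(a - b) for a, b in zip(x, candidate))
    return min(array2d, key=distance)

x = [35, 76, 97]
y = [[33, 76, 90], [75, 97, 74], [100, 102, 200]]
closest = find_closest_manhattan(x, y)
print(closest)

# A reversed colour is no longer mistaken for the original.
reversed_case = find_closest_manhattan([50, 67, 98], [[98, 67, 50], [52, 66, 95]])
print(reversed_case)
```

This is still a pure Python loop under the hood, so it will not be faster than the explicit version; it is only shorter.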
Thanks for the reply - the way you have done it is quite intriguing.
I sort of had a similar idea, but it was very fragmented.
This is as far as I got in the time I had:
def find_closest2(col, array2d):
  r,g,b = col[0], col[1], col[2]
  first_el = [i[0] for i in array2d]
  second_el = [i[1] for i in array2d]
  third_el = [i[2] for i in array2d]
which is almost exactly what you did in this line:
for num, elm in enumerate(array2d):
Dream

I've worked some magic and compressed the function to a single line:
def find_closest2(col, array2d):
    get_lowest = lambda rgb_vals: min([[abs(col[0]-elm[0]) + abs(col[1]-elm[1]) + abs(col[2]-elm[2]), num] for num, elm in enumerate(rgb_vals)], key=lambda x: x[0])[1]
    return array2d[get_lowest(array2d)]
It is difficult to tell if it is working. However, it doesn't produce errors, so that's a good sign.
I am going to test it out in my full project because that will really make sure it is working correctly.
This is not only an interesting problem but also a difficult one:
Have a good read about this matter here.
It has definitely worked:
Old code:
[Image: ixZ8M9O.jpg]
New code:
[Image: G7p38Nh.png]
There is a pretty big performance difference though.
(Dec-01-2019, 03:38 PM)ThomasL Wrote: [ -> ]This is not only an interesting problem but also a difficult one:
Have a good read about this matter here.
The post you linked is actually quite interesting. I feel like the YUV colour space could be quite useful here, both for what I am trying to do and possibly speed-wise as well.
Maybe I really misunderstand something, but why are you not using NumPy? Native loops and list comprehensions are slow in Python compared with NumPy's vectorised operations.

You can easily achieve what you want using numpy:

import numpy as np

def find_closest(color, img):
    return img[np.argmin(np.abs(img - color).sum(axis=1))]
# for Euclidean distance you could use, e.g. np.linalg.norm(..., axis=1) instead of np.abs(...).sum(axis=1)

# expects img.shape == (n, 3) and color.shape == (3,)

def find_closest_yuv(color, img):
    yuv_matrix = np.array([[put coefs here...],  # e.g. from SO post (cited above)
                           [put coefs here ], ...])
    return img[np.argmin(np.abs(img @ yuv_matrix.T - color @ yuv_matrix.T).sum(axis=1))]
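For a version that actually runs, here is a sketch with the placeholder coefficients filled in. It assumes the commonly quoted BT.601 RGB to YUV matrix; the linked post may use different values, so treat RGB_TO_YUV and the sample palette as assumptions, not as the coefficients meant above.

```python
import numpy as np

# Assumed BT.601 RGB -> YUV coefficients (not taken from the thread).
RGB_TO_YUV = np.array([
    [0.299, 0.587, 0.114],
    [-0.14713, -0.28886, 0.436],
    [0.615, -0.51499, -0.10001],
])

def find_closest_yuv(color, img):
    img = np.asarray(img)
    # Convert every row of img, and the query colour, to YUV.
    img_yuv = img.astype(float) @ RGB_TO_YUV.T
    color_yuv = np.asarray(color, dtype=float) @ RGB_TO_YUV.T
    # Compare in YUV space, but return the original RGB row.
    return img[np.argmin(np.abs(img_yuv - color_yuv).sum(axis=1))]

# Hypothetical palette for illustration.
palette = np.array([[98, 67, 50], [52, 66, 95], [200, 200, 200]])
result = find_closest_yuv([50, 67, 98], palette)
print(result)
```

The distance is computed in YUV space while the returned row is still the original RGB value, so the rest of the script does not need to know about the conversion.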
(Dec-03-2019, 12:39 AM)scidam Wrote: [ -> ]Maybe I really misunderstand something, but why are you not using NumPy? Native loops or list comprehensions are very slow in Python.

You can easily achieve what you want using numpy:

import numpy as np

def find_closest(color, img):
    return img[np.argmin(np.abs(img - color).sum(axis=1))]
# probably you would use, e.g. np.linalg.norm(... ) instead of np.abs
The second parameter of 'find_closest' is a 2d array so the setup you have now isn't going to work.
Error:
TypeError: unsupported operand type(s) for -: 'list' and 'list'
As for why I didn't pick numpy, it is mainly that I haven't really used it before and so don't know much about it.
Of course, it is definitely going to be a better choice, since the arrays I have are 1000s/10,000s of elements long; the question is whether I can actually do it in numpy.
This is because img and color are expected to be numpy arrays. The following should be a working example.

import numpy as np

# find_closest is defined above

# 100x3  matrix filled with random ints
img = np.random.randint(0, 256, size=(100, 3))  # upper bound is exclusive

# sample color
color = np.array([20, 50, 12])
print(find_closest(color, img))
You can convert the numpy arrays img and color to lists by applying the .tolist() method, e.g. img.tolist().
You can convert a Python list to a numpy array with np.array(your_python_list).
However, there is no practical reason to use pure Python loops and lists here, because they are slow. When img - color is executed and both img and color are numpy arrays, numpy automatically broadcasts ("up-scales") color to the shape of img and performs element-wise subtraction. No Python loops are used; the element-wise operations are performed in C.
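A small sketch of that broadcasting step, using the example values from earlier in the thread. The intermediate names diff and distances are only for illustration.

```python
import numpy as np

# np.asarray accepts plain Python lists as well as existing arrays.
img = np.asarray([[33, 76, 90], [75, 97, 74], [100, 102, 200]])  # shape (3, 3)
color = np.asarray([35, 76, 97])                                  # shape (3,)

diff = img - color                    # color is broadcast across every row of img
distances = np.abs(diff).sum(axis=1)  # one distance per row

print(diff.shape)
print(distances)
print(img[np.argmin(distances)])
```

One caveat worth checking in a real script: if the image is loaded with an unsigned dtype such as uint8, the subtraction can wrap around, so casting to a signed type first (e.g. img.astype(int)) is probably safer.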
(Dec-04-2019, 12:01 AM)scidam Wrote: [ -> ]This is because img and color are expected to be numpy arrays. The following should be working example.
Right, ok, that makes more sense.
(Dec-04-2019, 12:01 AM)scidam Wrote: [ -> ]performs element-wise "minus"
In that case, although I don't need this code any more, it could be improved using numpy, right?
def find_closest(x, array2d):
    x_sum = sum(x)
    return min(array2d, key=lambda z: abs(sum(z) - x_sum))
(Dec-04-2019, 04:03 PM)DreamingInsanity Wrote: [ -> ]this could be improved using numpy right?
Yes, a NumPy version of your function find_closest should be faster. Note that x and array2d must be numpy arrays here.

def find_closest(x, array2d):
    return array2d[np.argmin(np.abs(array2d.sum(axis=1) - x.sum()))]
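As a quick sanity check, this sketch compares the pure Python and NumPy sum-based versions on the example data from the thread. The _py and _np suffixes are added here only to keep the two apart.

```python
import numpy as np

def find_closest_py(x, array2d):
    x_sum = sum(x)
    return min(array2d, key=lambda z: abs(sum(z) - x_sum))

def find_closest_np(x, array2d):
    return array2d[np.argmin(np.abs(array2d.sum(axis=1) - x.sum()))]

x = [35, 76, 97]
y = [[33, 76, 90], [75, 97, 74], [100, 102, 200]]

result_py = find_closest_py(x, y)
result_np = find_closest_np(np.array(x), np.array(y))
print(result_py, result_np.tolist())
```

Both versions still compare channel sums, so the NumPy one is faster but has the same greyscale problem discussed above.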