Python Forum

Full Version: Need help creating complex loop around existing script
Hi all,

I have a script used to draw rectangles over multiple features within an image, based upon the pre-extracted x/y/r pixel coordinates of said features. This script is as follows:

import os
import numpy as np
import pandas as pd
import PIL
from PIL import Image
from PIL import ImageDraw
import glob
import re                       # import RegEx module

xeno_data = r'C:\file_path\data.xlsx'   # raw string, so '\f' is not read as a form-feed escape
ss = pd.read_excel(xeno_data)
These three pixel coord values (x/y/r) are all contained within the same column in the Excel spreadsheet, hence the following step to separate them into individual numbers:

# Extract filenames and coordinates from Excel spreadsheet (data.xlsx):
FandC = []
for index, row in ss.iterrows():
   filename = row['filename']
   coords   = row['xyr_coords']
   # Use RegEx to find anything that looks like a group of digits, possibly separated by a decimal point:
   x, y, r = re.findall(r'[0-9.]+',coords)
   print(f'DEBUG: filename={filename}, x={x}, y={y}, r={r}')
   FandC.append({'filename': filename, 'x':x, 'y':y, 'r':r})
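Note that re.findall returns strings, so x, y and r above are still text at this point. A minimal sketch of parsing them straight to floats follows; the helper name parse_xyr and the sample cell '(120.5, 64, 10)' are made up for illustration, since the real cell format in data.xlsx isn't shown.

```python
import re

def parse_xyr(coords):
    """Return (x, y, r) as floats from a string holding three numbers."""
    # Digits with an optional fractional part. This is a little stricter than
    # '[0-9.]+', which would also match a lone '.'.
    numbers = re.findall(r'\d+(?:\.\d+)?', str(coords))
    if len(numbers) != 3:
        raise ValueError(f'expected 3 numbers, found {len(numbers)} in {coords!r}')
    x, y, r = (float(n) for n in numbers)
    return x, y, r

# Hypothetical cell contents, not taken from the real spreadsheet.
print(parse_xyr('(120.5, 64, 10)'))   # (120.5, 64.0, 10.0)
```

Raising an error on anything other than three numbers should make a malformed spreadsheet row show up early rather than as a confusing unpacking error later.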
Creating new dataframes for the values of interest - 'filename', 'x', 'y', and 'r':

fandc=pd.DataFrame(FandC)
    #creates a dataframe for "filename", "x", "y", and "r".

fandc['filename'] [fandc['filename']=='Image1.jpg']
    # shows "fandc['filename']" where the "filename" is equal to (==) the string "Image1.jpg".

fandc_f = fandc[fandc['filename']=='Image1.jpg']
    # create new df called "fandc_f" that only includes "fandc['filename']" where "filename" == "Image1.jpg".

for index , row in fandc_f.iterrows():
    row
    break
Draw a transparent rectangle:

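im = Image.open('Image1.jpg')   # assumed path: the image has to be loaded before it can be converted
im = im.convert('RGBA')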
overlay = Image.new('RGBA', im.size)
draw = ImageDraw.Draw(overlay)

for index, row in fandc_f.iterrows():
    x, y, r = float(row['x']), float(row['y']), float(row['r'])
    # one rectangle per feature row, centred on (x, y) with half-width r
    draw.rectangle(((x - r, y - r), (x + r, y + r)), fill=(255, 0, 0, 55))

# Remove alpha for saving in jpg format:        
img = Image.alpha_composite(im, overlay)
img = img.convert("RGB")
This code is working perfectly... for one image. However, I now need to wrap the whole process in a big outer loop that automatically goes through each unique filename in a pandas df.

For each one I need it to:
  • load the image,
  • filter the df for respective filenames,
  • apply my rectangle drawing loop,
  • save the image,
  • and then loop on to next image automatically (as I have thousands of images to process, and cannot do it manually).

If any clarification is needed, ask away!

Thanks, Rhod
If you take what you have and write it as a function, repeating the processing over and over for each image should be simple. But you already know that, so what is the difficult part that I am missing?
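For what it's worth, here is a rough sketch of what "write it as a function" could look like for a single image. The function name annotate_image, its parameters, and the temporary demo files are all assumptions for illustration; only the Pillow and pandas calls come from the script above.

```python
import os
import tempfile

import pandas as pd
from PIL import Image, ImageDraw

def annotate_image(in_path, rows, out_path, fill=(255, 0, 0, 55)):
    """Draw one translucent square per row of `rows` and save the result."""
    im = Image.open(in_path).convert('RGBA')
    overlay = Image.new('RGBA', im.size, (0, 0, 0, 0))   # fully transparent layer
    draw = ImageDraw.Draw(overlay)
    for row in rows.itertuples(index=False):
        x, y, r = float(row.x), float(row.y), float(row.r)
        draw.rectangle((x - r, y - r, x + r, y + r), fill=fill)
    # JPEG has no alpha channel, so flatten back to RGB before saving.
    Image.alpha_composite(im, overlay).convert('RGB').save(out_path)

# Demo on a throwaway white image (hypothetical data, not the real photos).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'Image1.jpg')
dst = os.path.join(workdir, 'Image1_boxed.jpg')
Image.new('RGB', (200, 200), 'white').save(src)

rows = pd.DataFrame([{'filename': 'Image1.jpg', 'x': '100', 'y': '100', 'r': '30'}])
annotate_image(src, rows, dst)
```

Everything that differs between images (input path, rows, output path) is passed in as an argument, so nothing inside the function mentions 'Image1.jpg'.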
I'm simply struggling to convert this into a function. I agree it should be simple and I understand the theory, but I'm clearly doing something wrong as I can't get it to work. Could you please help show/explain to me how it's done?

I am just a humble ecology student stumbling around a programmer's world lol. I have written much of this with my supervisor, but now, in light of the lockdown, I am struggling to work as effectively from home without his advice. Moreover, he has just taken a large block of annual leave without giving me a heads up, so I can't contact him.

Thnx
Is the problem that it only works for "Image1.jpg" and that you need a way for it to work for each filename? You still have the filenames in FandC (btw, using fandc for the dataframe variable is really confusing). You can use FandC to get the image filename.
master_df=pd.DataFrame(FandC)
    #creates a dataframe for "filename", "x", "y", and "r".

# for each filename, get the dataframe and process
for img in FandC:
    img_filename = img['filename']
    img_df = master_df[master_df['filename'] == img_filename]
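One thing to watch: FandC has one entry per feature, so looping over it visits a filename once for every rectangle in that image. A possible alternative, sketched below under the assumption that master_df has a 'filename' column as in this thread, is to let pandas group the rows so each image is visited exactly once. The helper for_each_image and the sample rows are made up for illustration.

```python
import pandas as pd

def for_each_image(master_df, process):
    """Call process(filename, rows) once per unique filename."""
    results = {}
    # sort=False keeps filenames in the order they first appear.
    for filename, rows in master_df.groupby('filename', sort=False):
        results[filename] = process(filename, rows)
    return results

# Hypothetical data: two features in Image1.jpg, one in Image2.jpg.
master_df = pd.DataFrame([
    {'filename': 'Image1.jpg', 'x': '10', 'y': '20', 'r': '5'},
    {'filename': 'Image1.jpg', 'x': '40', 'y': '50', 'r': '8'},
    {'filename': 'Image2.jpg', 'x': '70', 'y': '30', 'r': '6'},
])

counts = for_each_image(master_df, lambda name, rows: len(rows))
print(counts)   # {'Image1.jpg': 2, 'Image2.jpg': 1}
```

In the real script, process would be whatever loads the image, draws the rectangles for those rows, and saves the result; here it just counts rows to show that each group holds only that file's features.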
I agree it is confusing! I have renamed the df "master_df" as advised.

I have tried to implement the for loop suggested above in my script, but am not entirely sure where to integrate it, and as a result I keep getting the classic "invalid syntax" error when I try to run the script. Where in my script above would you integrate this for loop?

Sorry for being so useless :S
Seeing your code would be helpful. All of it, not bits and pieces.