Thread: How to rewrite image file name based on ocr data.txt (/thread-9569.html), Python Forum (https://python-forum.io)
How to rewrite image file name based on ocr data.txt - kevinchr - Apr-16-2018

Hey, I'm pretty new to Python and still learning every day. I'm stuck on what I want to accomplish with my Python script.

What I'm trying to do is OCR images in batch, save that data into data.txt, and after that rename the images using the OCR data. For example, I have an image named 'dog-mask.jpg', and after OCR has run over it I would like to rename the file to 'no one cared who I was until I put on the mask.jpg'.

The OCR part seems to work fine, but I have no idea how to set the new image file names with data from data.txt. Could anyone help me out please? I would really appreciate it if it is not too much trouble.

Below is the code of my OCR script:

import pytesseract
import os
from PIL import Image
import re

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'  # path of tesseract
path = 'C:\\users\\kevin\\downloads\\downloads'  # path of image folder

# function to convert image to text and return type: string
def ocr(file_to_ocr):
    im = Image.open(path + "\\" + file_to_ocr)
    txt = pytesseract.image_to_string(im)
    return txt

file_list = os.listdir(path)  # file names in list (not sorted)
directory = os.path.join(path)  # path for storing the text file

# functions to sort the file names in order of the numerical value present in them
def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    # raw string so that \d is not treated as an invalid string escape
    return [atoi(c) for c in re.split(r'(\d+)', text)]

file_list.sort(key=natural_keys)  # file names in list (sorted)

# for every file in the folder
for file in file_list:
    # selecting image file type
    if file.endswith(".jpg"):
        txt = ocr(file)  # calling the ocr function
        # appending the text into the file
        with open(directory + "\\" + 'data' + ".txt", 'a+') as f:
            f.write("\n")
            f.write(file)
            f.write("\n")
            f.write('-----------------------------------------')
            f.write("\n")
            f.write('!!!Start!!!')
            f.write("\n")
            f.write(str(txt))
            f.write("\n")
            f.write('!!!End!!!')
            f.write("\n")
            f.write('-----------------------------------------')
            f.write("\n")

print("Image Conversion completed")
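One possible way to do the renaming step asked about above, as a rough sketch rather than a definitive answer: instead of parsing data.txt afterwards, rename each image right after it has been OCR'd. The helper names `safe_filename` and `rename_images`, the 100-character limit, and the " (1)" suffix for duplicate names are all assumptions made for illustration; only `os.listdir`, `os.rename`, `os.path.exists`, and the `re` calls are standard library. OCR output usually contains newlines and characters that Windows does not accept in file names, so the text is cleaned first.

```python
import os
import re

# Characters Windows does not allow in file names, plus control characters
# (this also covers the newlines that OCR output usually contains).
_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(text, max_len=100):
    """Turn raw OCR text into a usable file name stem (hypothetical helper)."""
    text = _ILLEGAL.sub(' ', text)      # replace forbidden characters with spaces
    text = ' '.join(text.split())       # collapse runs of whitespace into single spaces
    return text[:max_len].strip(' .')   # cap the length, drop trailing dots and spaces


def rename_images(folder, ocr_func, ext='.jpg'):
    """Rename each image in `folder` after the text ocr_func(file_name) returns.

    `ocr_func` is assumed to take a bare file name and return a string,
    like the ocr() function in the script above.
    """
    for name in os.listdir(folder):
        if not name.lower().endswith(ext):
            continue
        stem = safe_filename(ocr_func(name))
        if not stem or stem + ext == name:
            continue                    # nothing usable found, or already named correctly
        target = os.path.join(folder, stem + ext)
        counter = 1
        while os.path.exists(target):   # do not overwrite an existing file
            target = os.path.join(folder, '{} ({}){}'.format(stem, counter, ext))
            counter += 1
        os.rename(os.path.join(folder, name), target)
```

With the script above this would be called as `rename_images(path, ocr)` after the data.txt loop, or in place of it. Keep in mind that the file names written into data.txt will no longer match once the images are renamed, so it is worth trying this on a copy of the folder first.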