Python Forum
How to rewrite image file name based on ocr data.txt
#1
Hey,

I'm pretty new to Python and still learning every day. I'm stuck on what I want to accomplish with my Python script.

What I'm trying to do is OCR images in batch, save that data into data.txt, and after that rename the images using the OCR data.

For example, I have an image named 'dog-mask.jpg', and after OCR has run over it I would like to rename the file to something like 'no one cared who I was until I put on the mask.jpg'.

[Image: No-One-Cared-Who-I-Was-Until-I-Put-On-The-Mask.jpg]
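
To make the target concrete, here is a minimal sketch of turning raw OCR text into a usable file name. The helper name `text_to_filename`, the 100-character limit, and the "untitled" fallback are assumptions for illustration only; the character set removed is the one Windows rejects in file names.

```python
import re

def text_to_filename(text, ext=".jpg", max_len=100):
    """Turn raw OCR text into a file name that is safe on Windows.

    Hypothetical helper: the name, max_len and the "untitled" fallback
    are illustrative choices, not part of pytesseract or the script.
    """
    # Drop characters Windows does not allow in file names.
    name = re.sub(r'[<>:"/\\|?*]', "", text)
    # Collapse newlines and repeated whitespace into single spaces.
    name = " ".join(name.split())
    # Trim to a reasonable length and strip trailing dots/spaces.
    name = name[:max_len].rstrip(". ")
    # Fall back to a placeholder if OCR returned nothing usable.
    return (name or "untitled") + ext

print(text_to_filename("No one cared who I was\nuntil I put on the mask\n"))
```

OCR output usually contains line breaks and sometimes punctuation, so some cleanup like this is needed before the text can be passed to a rename call.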

The OCR part seems to work fine, but I have no idea how to set the new image file names using the data from data.txt.

Could anyone help me out, please? I would really appreciate it, if it is not too much trouble.

Below is the code of my OCR script:


import pytesseract
import os
from PIL import Image
import re

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe' # path of tesseract

path = 'C:\\users\\kevin\\downloads\\downloads' # path of image folder

# convert an image to text; returns a string
def ocr(file_to_ocr):
    im = Image.open(os.path.join(path, file_to_ocr))
    txt = pytesseract.image_to_string(im)
    return txt

file_list = os.listdir(path) # file names in list (not sorted)
directory = path # folder where data.txt is stored (same as the image folder)

# function to sort the file names in order of numerical value present in it
def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    return [atoi(c) for c in re.split(r'(\d+)', text)]

file_list.sort(key=natural_keys) # file names in list (sorted)

# for every file in the folder
for file in file_list:
    # select only .jpg images
    if file.endswith(".jpg"):
        txt = ocr(file) # calling the ocr function
        # append the text to data.txt
        with open(os.path.join(directory, 'data.txt'), 'a+') as f:
            f.write("\n")
            f.write(file)
            f.write("\n")
            f.write('-----------------------------------------')
            f.write("\n")
            f.write('!!!Start!!!')
            f.write("\n")
            f.write(str(txt))
            f.write("\n")
            f.write('!!!End!!!')
            f.write("\n")
            f.write('-----------------------------------------')
            f.write("\n")
print("Image Conversion completed")
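
One possible way to go from data.txt back to file names is sketched below. It assumes data.txt has exactly the layout the script above writes (file name, dashed line, `!!!Start!!!`, OCR text, `!!!End!!!`). The function names `parse_records`, `safe_name` and `rename_from_data` are made up for this sketch, and it skips a file rather than overwriting when the new name is already taken.

```python
import os
import re

# Matches one record in the layout the script above writes to data.txt.
RECORD = re.compile(
    r"^(?P<name>[^\n]+)\n-+\n!!!Start!!!\n(?P<text>.*?)\n!!!End!!!",
    re.MULTILINE | re.DOTALL,
)

def parse_records(content):
    """Return {original file name: OCR text} from the contents of data.txt."""
    return {m["name"]: m["text"] for m in RECORD.finditer(content)}

def safe_name(text, ext):
    """Hypothetical helper: reduce OCR text to a Windows-safe file name."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "", text)
    cleaned = " ".join(cleaned.split())[:100].rstrip(". ")
    return cleaned + ext if cleaned else None

def rename_from_data(folder, data_file="data.txt"):
    """Rename each listed image to its OCR text; return the renames done."""
    with open(os.path.join(folder, data_file)) as f:
        records = parse_records(f.read())
    done = []
    for old, text in records.items():
        ext = os.path.splitext(old)[1]
        new = safe_name(text, ext)
        src = os.path.join(folder, old)
        # Skip empty OCR results and files that no longer exist.
        if not new or not os.path.exists(src):
            continue
        dst = os.path.join(folder, new)
        # Never overwrite: skip if the target name is already taken.
        if os.path.exists(dst):
            continue
        os.rename(src, dst)
        done.append((old, new))
    return done
```

Calling `rename_from_data(path)` after the OCR loop would then rename the images in place. Because the script opens data.txt in append mode, repeated runs leave several records per image; this sketch simply uses the last one.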