 How to rewrite image file name based on ocr data.txt
#1
Hey,

I'm pretty new to Python and still learning every day. I'm stuck on one thing I want to accomplish with my Python script.

What I'm trying to do is OCR images in batch, save the recognized text into data.txt, and after that rename the images using the OCR text.

For example, I have an image named 'dog-mask.jpg', and after OCR has run over it I would like to rename the file to 'no one cared who I was until I put on the mask.jpg'.
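To make the example above concrete: OCR output usually contains line breaks and sometimes characters that Windows does not allow in file names, so the text needs a little cleanup before it can be used as a name. The sketch below is only one possible approach; `text_to_filename`, its `ext` and `max_len` parameters, and the 100-character limit are assumptions made for illustration, not part of the original script.

```python
import re

def text_to_filename(text, ext=".jpg", max_len=100):
    """Turn raw OCR text into a file name, or None if nothing usable is left.

    Hypothetical helper: the name, the default extension and the length
    limit are illustrative choices.
    """
    # Collapse every run of whitespace (including OCR line breaks) to one space
    name = " ".join(text.split())
    # Drop characters that Windows rejects in file names, plus control characters
    name = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "", name)
    # Limit the length, then remove leading/trailing spaces and dots
    name = name[:max_len].strip(" .")
    return name + ext if name else None
```

With OCR text such as "no one cared who I was\nuntil I put on the mask\n", this would produce 'no one cared who I was until I put on the mask.jpg'. Returning None for empty text lets the caller skip images where the OCR found nothing.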


The OCR part seems to work fine, but I have no idea how to set the new image file names using the data from data.txt.

Could anyone help me out, please? I would really appreciate it, if it is not too much trouble.

Below is the code of my OCR script:


import pytesseract
import os
from PIL import Image
import re

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe' # path of tesseract

path = 'C:\\users\\kevin\\downloads\\downloads' # path of image folder

# convert one image to text; returns a string
def ocr(file_to_ocr):
    im = Image.open(os.path.join(path, file_to_ocr))
    txt = pytesseract.image_to_string(im)
    return txt

file_list = os.listdir(path) # file names in list (not sorted)
directory = path # folder where the text file is stored (same as the image folder)

# helpers to sort the file names by the numeric values they contain
def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    # raw string so that \d is not treated as an (invalid) string escape
    return [atoi(c) for c in re.split(r'(\d+)', text)]

file_list.sort(key=natural_keys) # file names in list (sorted)

# for every file in the folder
for file in file_list:
    # select only .jpg images
    if file.endswith(".jpg"):
        txt = ocr(file) # call the ocr function
        # append the text to data.txt
        with open(os.path.join(directory, 'data.txt'), 'a+') as f:
            f.write("\n")
            f.write(file)
            f.write("\n")
            f.write('-----------------------------------------')
            f.write("\n")
            f.write('!!!Start!!!')
            f.write("\n")
            f.write(str(txt))
            f.write("\n")
            f.write('!!!End!!!')
            f.write("\n")
            f.write('-----------------------------------------')
            f.write("\n")
print("Image Conversion completed")
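
Since data.txt already has a regular layout (file name, a dashed line, `!!!Start!!!`, the OCR text, `!!!End!!!`), one way to do the renaming is to parse that file back and call `os.rename` for each entry. The following is a minimal sketch under that assumption, not a tested solution for every folder; `parse_ocr_log`, `clean_name`, `rename_images`, the `dry_run` flag and the 100-character limit are hypothetical names and choices introduced here.

```python
import os
import re

# One entry as written by the script above: name, dashes, Start marker, text, End marker
ENTRY = re.compile(
    r"^(?P<file>[^\n]+)\n-+\n!!!Start!!!\n(?P<text>.*?)\n!!!End!!!",
    re.DOTALL | re.MULTILINE,
)

def parse_ocr_log(content):
    """Return a list of (original file name, OCR text) pairs."""
    return [(m.group("file"), m.group("text")) for m in ENTRY.finditer(content)]

def clean_name(text, max_len=100):
    """Make OCR text usable as a file name (assumed rules, Windows-oriented)."""
    name = " ".join(text.split())                      # collapse whitespace and newlines
    name = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "", name)  # drop forbidden characters
    return name[:max_len].strip(" .")

def rename_images(folder, log_name="data.txt", dry_run=True):
    """Plan (and, if dry_run is False, perform) renames based on the OCR log."""
    with open(os.path.join(folder, log_name), errors="replace") as f:
        entries = parse_ocr_log(f.read())
    planned, used = [], set()
    for old_name, text in entries:
        src = os.path.join(folder, old_name)
        base = clean_name(text)
        # Skip images with no recognized text or that no longer exist
        if not base or not os.path.isfile(src):
            continue
        ext = os.path.splitext(old_name)[1]
        new_name, counter = base + ext, 1
        # Avoid overwriting an existing file or reusing a planned name
        while new_name in used or os.path.exists(os.path.join(folder, new_name)):
            counter += 1
            new_name = f"{base} ({counter}){ext}"
        used.add(new_name)
        planned.append((old_name, new_name))
        if not dry_run:
            os.rename(src, os.path.join(folder, new_name))
    return planned
```

Calling `rename_images(path)` first only returns the planned (old, new) pairs so they can be checked; `rename_images(path, dry_run=False)` then performs the renames. Keeping the text in memory and renaming right after `ocr(file)` would avoid parsing data.txt at all, which may be simpler if the text file is not needed for anything else.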