Please help me speed up my script :(

mikeak2001 · Mar-26-2020, 10:33 PM

Hi,

I recently asked for help with CRC16-CCITT/FALSE checksums.
I was given some great advice to try crcengine.

https://python-forum.io/Thread-Read-micr...-checksums

I have put together a Python script that reads a binary file one byte at a time. After each byte, it calculates the checksum of the data read so far and checks whether the next two bytes in the file match that checksum.

Can anyone give me some pointers on speeding up this script? I am reading a 2048 KB file and finding all the data blocks along with their checksums, but it is extremely slow. After running for 24 hours, the script had only read through 23% of the file. I don't really want to run the computer for 4 days on each file if I can cut down the time.
My code is below:

#!/usr/bin/python
 
import os
import sys
import crcengine
 
def progress(count, total, suffix=''):
    bar_len = 60
    filled_len = int(round(bar_len * count / float(total)))
 
    percents = round(100.0 * count / float(total), 1)
    bar = '=' * filled_len + '-' * (bar_len - filled_len)
 
    sys.stdout.write('[%s] %s%s ...%s\r' % (bar, percents, '%', suffix))
    sys.stdout.flush()
 
 
def main(argv):
    inputfile = argv[0]             # main() is called with sys.argv[1:], so argv[0] is the input file
    inputfilesize = os.path.getsize(inputfile)
    outputfile = inputfile+'.txt'
    woutputfiledata = open(outputfile, 'a+')
 
    print ('Input file is:' + inputfile)
    print ('Finding checksums...')
 
    f1 = open(inputfile, "rb")      # open argv[1] file for binary reading
                                    #
    block = 1                       # used to keep track of found blocks - increments
    previousPos = 0                 # starts at zero - increments by amount of bytes found in each block
    currentPos = 0                  # current byte position within the file
    bytelist = []                   # bytes currently read
    message = b''                   # byte string passed to the crcengine function
    chkmessage = b''                # calculated checksum of message - compared to the last two bytes read
    CRCcheck = ['00','00']          # last two bytes of bytelist - compared to the checksum of message
    CRCcheckMessage = 'CRC=0x'          # CRCcheckMessage to compare to chkmessage
    crc_algorithm = crcengine.new('crc16-ccitt-false')  # set up crc algorithm
    successcounter = 0
     
    while 1:
        byte_s = f1.read(1)     # read 1 byte from the file
        if not byte_s:          # if no bytes were read, close both files and quit
            f1.close()
            woutputfiledata.close()
            break
             
        bytelist.append(byte_s)  # add read byte to bytelist
         
        while len(bytelist) < 3:     # make sure there are at least 3 bytes
            byte_s = f1.read(1)     # last two are used as the checksum comparator
            currentPos +=1          # increase currentPos to keep track of bytes read and position within file
            bytelist.append(byte_s) # then add the byte to bytelist
             
        CRCcheck[0] = bytelist[-2]  # add last but one byte read to the first position of the checksum list
        CRCcheck[1] = bytelist[-1]  # add last byte read to the second position of the checksum list
        message = message + bytelist[-3]    # add all bytes bar the last two to the byte string
        result = crc_algorithm(message)     # send the byte string and store the returned checksum int into result
        chkmessage ='CRC=0x{:04x}'.format(result)   # convert checksum message into a 2 byte hex string
         
        CRCcheckMessage = CRCcheckMessage + CRCcheck[0].hex()
        CRCcheckMessage = CRCcheckMessage + CRCcheck[1].hex()
         
        if CRCcheckMessage == chkmessage:
            woutputfiledata.write('BLK: '+ str(block)+'\n')
            woutputfiledata.write('SOB: '+ '0x{:08x}'.format(previousPos)+'\n')
            woutputfiledata.write('EOB: '+ '0x{:08x}'.format(currentPos-2)+'\n')
            woutputfiledata.write('CKS: '+ chkmessage+'\n')
            woutputfiledata.write('\n')
            block += 1
            successcounter += 1
            previousPos = currentPos + 1
            bytelist.clear()
             
        CRCcheckMessage = 'CRC=0x'
        currentPos +=1
        progress(currentPos, inputfilesize)
 
if __name__ == "__main__":
    main(sys.argv[1:])

I will take any criticism you can throw at me; I haven't coded anything for years. I should take this as a lesson to keep on top of my coding.

I will only ever have to read each file once. Once I have all the checksums, I will be able to modify a particular block and recalculate just that block's checksum.
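
As a rough sketch of that recalculation step: this assumes each checksum covers only its own block's bytes (SOB through EOB inclusive) and is stored big-endian in the two bytes that follow, which is how the script compares them. The names `crc16_ccitt_false` and `rewrite_block_checksum` are hypothetical, and the CRC here is a plain bitwise implementation (polynomial 0x1021, initial value 0xFFFF) rather than crcengine, so the snippet has no dependencies.

```python
def crc16_ccitt_false(data):
    """Bitwise CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def rewrite_block_checksum(image, sob, eob):
    """Recompute the CRC of image[sob..eob] and store it in the two bytes after eob.

    `image` is a bytearray holding the whole file (hypothetical helper).
    Assumption: the checksum covers only this block and is stored big-endian.
    """
    crc = crc16_ccitt_false(image[sob:eob + 1])
    image[eob + 1:eob + 3] = crc.to_bytes(2, "big")
    return crc


image = bytearray(b"123456789" + b"\x00\x00")   # 9 data bytes + placeholder checksum
crc = rewrite_block_checksum(image, 0, 8)
print(hex(crc))   # 0x29b1, the standard check value for this CRC
```

With the SOB and EOB values from the output file, only the modified block has to be rehashed, not the whole file.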

The script outputs to a txt file like below:

BLK: 1
SOB: 0x00000000
EOB: 0x000084fb
CKS: CRC=0x1a49

BLK: 2
SOB: 0x000084fe
EOB: 0x00039b34
CKS: CRC=0x4444

BLK: 3
SOB: 0x00039b37
EOB: 0x0006d4cb
CKS: CRC=0xffff

BLK: 4
SOB: 0x0006d4ce
EOB: 0x000754cc
CKS: CRC=0xffff

Between each pair of blocks you will notice 2 bytes missing; these are where the checksum of the previous block is stored.
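
That layout can be checked mechanically. The following is a minimal sketch, assuming the txt file has exactly the record format shown above with blank lines between records; `parse_blocks` and `gaps_are_two_bytes` are hypothetical names.

```python
def parse_blocks(text):
    """Parse 'BLK/SOB/EOB/CKS' records into a list of dicts."""
    blocks = []
    for record in text.strip().split("\n\n"):
        fields = dict(line.split(": ", 1) for line in record.splitlines())
        blocks.append({
            "blk": int(fields["BLK"]),
            "sob": int(fields["SOB"], 16),
            "eob": int(fields["EOB"], 16),
            "crc": int(fields["CKS"].split("=")[1], 16),
        })
    return blocks


def gaps_are_two_bytes(blocks):
    """True if every block starts exactly 3 positions after the previous EOB,
    i.e. exactly two checksum bytes sit between consecutive blocks."""
    return all(b["sob"] == a["eob"] + 3 for a, b in zip(blocks, blocks[1:]))


sample = """BLK: 1
SOB: 0x00000000
EOB: 0x000084fb
CKS: CRC=0x1a49

BLK: 2
SOB: 0x000084fe
EOB: 0x00039b34
CKS: CRC=0x4444
"""
blocks = parse_blocks(sample)
print(gaps_are_two_bytes(blocks))   # True: 0x84fe == 0x84fb + 3
```
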
The txt output above is from the last file I scanned using this script. From manually checking the file locations, it appears to be working. I just need to be able to run through the file quicker.
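
One possible direction, as a sketch rather than a tested fix: the script passes the whole accumulated `message` to the CRC function after every byte, so the work grows with the square of the data read. A CRC can instead be carried forward one byte at a time. The code below assumes the CRC16-CCITT/FALSE parameters (polynomial 0x1021, initial value 0xFFFF), a big-endian stored checksum as in the script, and that each block is checksummed on its own, so the running CRC restarts after a match. `crc16_update` and `find_blocks` are hypothetical names, and the whole file is assumed to fit in memory.

```python
def crc16_update(crc, byte):
    """Feed one byte (0-255) into a CRC-16/CCITT-FALSE register."""
    crc ^= byte << 8
    for _ in range(8):
        if crc & 0x8000:
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc


def find_blocks(data):
    """Return (start, end, crc) for each span whose CRC equals the next two bytes."""
    blocks = []
    start = 0
    crc = 0xFFFF                                  # CCITT-FALSE initial value
    pos = 0
    while pos + 2 < len(data):
        crc = crc16_update(crc, data[pos])        # extend the running CRC by one byte
        stored = (data[pos + 1] << 8) | data[pos + 2]   # next two bytes, big-endian
        if crc == stored:
            blocks.append((start, pos, crc))
            pos += 3                              # skip the two checksum bytes
            start = pos
            crc = 0xFFFF                          # assumption: per-block checksum
        else:
            pos += 1
    return blocks


# Typical use (path is a placeholder):
# with open("firmware.bin", "rb") as f:
#     for start, end, crc in find_blocks(f.read()):
#         print('SOB: 0x{:08x} EOB: 0x{:08x} CRC=0x{:04x}'.format(start, end, crc))
```

Each file position then costs one CRC step instead of a pass over everything read so far, and reading the file in one call avoids millions of one-byte `read` calls.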

Thanks in advance.
