Python Forum
Byte string catenation inefficient in 3.7?
#8
NumPy: vaguely aware of it, not familiar.
Why strings? The bad answer is simply that this is how it was done in the code I found when I searched for how to write a BMP file.

Sounds like I should check out NumPy? Does it include functions to convert to a string for an xxx.write(...)? Or does it have an equivalent output function?

OK, here is all the sordid detail. I will be pasting below:
1. Python 2.7 Code
2. Python 2.7 Output
3. Python 3.7 Code
4. Python 3.7 Output

The two programs are more or less identical. There are some instruction comments (just a few!). The only difference between the versions is the initialisation of the pixel string (and a padding string) as either char or byte.
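To make that one difference concrete, here is a minimal Python 3 sketch. The variable names are illustrative and not taken from the programs below. In Python 3, `struct.pack` returns `bytes`, so the accumulator has to be a bytes-like object, and `b''` is a shorter spelling of `bytes('', 'utf-8')`.

```python
import struct

# In Python 3, struct.pack returns a bytes object.
pixel = struct.pack('<BBB', 128, 0, 255)   # one pixel, packed as b, g, r

empty_a = bytes('', 'utf-8')   # the spelling used in the 3.7 program
empty_b = b''                  # equivalent literal

acc = empty_b
acc += pixel                   # bytes is immutable, so this binds acc to a new object

# Mixing the Python 2 style '' (a str in Python 3) with bytes raises TypeError.
try:
    '' + pixel
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type(pixel).__name__, empty_a == empty_b, acc, mixed_ok)
```

This is why the Python 2 initialisation cannot simply be left in place when the script is run under 3.7.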

You can edit the file names if you want to verify for yourself that the output file is correct, or comment out the file open and write as you like.
The output logs speak for themselves!

On my machine the 3.7 version uses about 40 MB of RAM (out of 16 GB) and about 12% of CPU.

Here are the graphs of the elapsed time as the string builds:
https://gyazo.com/33d110b55689358d9e2331f9cd99977e
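A likely explanation for the shape of that curve, offered as an assumption rather than a diagnosis: `bytes` objects are immutable, so each `pixels += ...` can end up copying everything accumulated so far. A minimal sketch of the same loop with a mutable `bytearray` accumulator follows. `build_pixels` is a hypothetical helper, not part of the programs below, and it has not been timed against the originals here.

```python
import struct
from math import floor


def build_pixels(width, height):
    """Return 24-bit BMP pixel data: rows in loop order, each padded to 4 bytes."""
    row_bytes = width * 3
    padding = (4 - row_bytes % 4) % 4      # integer arithmetic only
    pixels = bytearray()                   # mutable, so += extends it in place
    for Y in range(height):
        for X in range(width):
            r = floor((255 * X) / width)
            g = floor((255 * Y) / height)
            pixels += struct.pack('<BBB', 128, g, r)   # b, g, r as in the original
        pixels += bytes(padding)           # bytes(n) is n zero bytes
    return bytes(pixels)                   # immutable copy for writing to a file


data = build_pixels(4, 2)
print(len(data))
```

Collecting the packed pixels in a list and calling `b''.join(...)` once at the end would be another way to avoid repeated copying.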


1. Python 2.7 Code
# Testing struct.pack and string catenation in Python2 and 3
# This is a demo cut down from real app (which draws charts from survey data)
# creates a 'square rainbow' bmp file
## edit for Py 2 (char strings) or 3 (byte strings) versions

# edit these for your set up and test
Size = 1024    # test image size, pixels
path = 'D:/Python27/MyScripts/Test/'  # for the bmp file

import csv
import os
import struct
from math import trunc, ceil, floor
import time


def BuildImage(name, XY):
    # name : filename
    # XY : (width, height) pixels

    # for stats and timing
    n0 = 0
    t00 = time.clock()
    t01 = t00
    
    chtName = path+'cht_'+name+'.bmp'
    print("drawing "+chtName)
    
    hdr = bmpHdr(XY)
    #print(hdr)
    
    pixels =''  ##  Py2
    ##pixels = bytes('', 'utf-8')  ## Py3
    
    for Y in range(0, XY[1]): # (BMPs are L to R from the bottom L row)
        for X in range(0, XY[0]):
            # square rainbow for time tests -  as opposed to real data
            x = floor((255 * X)/XY[0])
            y = floor((255 * Y)/XY[1])
            (r,g,b) = [x, y, 128]   #Colour(data[x ,y])
            pixels += struct.pack('<BBB',b,g,r)
            
        row_mod = (hdr['width']*hdr['colordepth']//8) % 4  # '//' keeps this an int on both Py2 and Py3
        if row_mod == 0:
            padding = 0 
        else:
            padding = (4 - row_mod)
        padbytes = ''  #  P2
        #padbytes = bytes('', 'utf-8')  # P3
        for i in range(padding):
            padbytes += struct.pack('<B',0)
        pixels = pixels + padbytes

        # stats log
        if(0 == Y % 100 or Y == 0):
            n = len(pixels)
            t02 = time.clock()
            log = "{0:5d} L={1:8,d}, delta={2:7,d}, pad={3:4d}".format(XY[1]-Y, n, n-n0, padding)  # rows remaining = height - Y
            log += ", time = {0:6.3f}, cum = {1:7.3f}".format(t02-t01, t02-t00)
            print(log)
            t01 = t02
            n0 = n
    
    print("pixels generated, len = "+str(len(pixels)))
    bmp_write(chtName, hdr, pixels)
    

def bmpHdr(XY):
    print("bmphdr xy "+str(XY))
    hdr = {
        'mn1':66,
        'mn2':77,
        'filesize':0,
        'undef1':0,
        'undef2':0,
        'offset':54,
        'headerlength':40,
        'width':XY[0],   #256
        'height':XY[1],  #256
        'colorplanes':0,
        'colordepth':24,
        'compression':0,
        'imagesize':0,
        'res_hor':0,
        'res_vert':0,
        'palette':0,
        'importantcolors':0
        }
    return hdr


#Function to write a bmp file.  It takes a dictionary (hdr) of
#header values and the pixel data (pixels) and writes them
#to a file.  This function is called at the bottom of the code.
def bmp_write(name, hdr, pixels):
    print('making bmp with '+str(len(pixels))+" pixels")
    mn1 = struct.pack('<B',hdr['mn1'])
    mn2 = struct.pack('<B',hdr['mn2'])
    filesize = struct.pack('<L',hdr['filesize'])
    undef1 = struct.pack('<H',hdr['undef1'])
    undef2 = struct.pack('<H',hdr['undef2'])
    offset = struct.pack('<L',hdr['offset'])
    headerlength = struct.pack('<L',hdr['headerlength'])
    width = struct.pack('<L',hdr['width'])
    height = struct.pack('<L',hdr['height'])
    colorplanes = struct.pack('<H',hdr['colorplanes'])
    colordepth = struct.pack('<H',hdr['colordepth'])
    compression = struct.pack('<L',hdr['compression'])
    imagesize = struct.pack('<L',hdr['imagesize'])
    res_hor = struct.pack('<L',hdr['res_hor'])
    res_vert = struct.pack('<L',hdr['res_vert'])
    palette = struct.pack('<L',hdr['palette'])
    importantcolors = struct.pack('<L',hdr['importantcolors'])
    #create the outfile
    outfile = open(name,'wb')   # 'bitmap_image.bmp'
    #write the header + the_bytes
    hdr = mn1+mn2
    hdr += filesize+undef1+undef2
    hdr += offset+headerlength+width+height
    hdr += colorplanes+colordepth+compression+imagesize+res_hor+res_vert
    hdr += palette+importantcolors
    print("headers = "+str(hdr))
    bmp = hdr + pixels
    print('writing bmp, len = '+str(len(bmp)))
    outfile.write(bmp)

###################################    
def main():

    time0 = time.clock()
    print("start {0}x{0} bmp file @ {1:.3f}".format(Size, time0))

    # set the size of the bmp image here
    BuildImage("test", (Size,Size))
    time1 = time.clock()
    print("Chart complete, run time {0:.3f} secs".format(time1-time0))
    

if __name__ == '__main__':
    main()
2. Python 2.7 Output
Output:
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>>
=============== RESTART: D:\Python27\MyScripts\Test\test27.py ===============
start 1024x1024 bmp file @ 0.000
drawing D:/Python27/MyScripts/Test/cht_test.bmp
bmphdr xy (1024, 1024)
 1024 L=   3,072, delta=  3,072, pad=   0, time =  0.013, cum =   0.013
  924 L= 310,272, delta=307,200, pad=   0, time =  0.163, cum =   0.175
  824 L= 617,472, delta=307,200, pad=   0, time =  0.162, cum =   0.337
  724 L= 924,672, delta=307,200, pad=   0, time =  0.161, cum =   0.498
  624 L=1,231,872, delta=307,200, pad=   0, time =  0.179, cum =   0.677
  524 L=1,539,072, delta=307,200, pad=   0, time =  0.241, cum =   0.918
  424 L=1,846,272, delta=307,200, pad=   0, time =  0.267, cum =   1.185
  324 L=2,153,472, delta=307,200, pad=   0, time =  0.212, cum =   1.397
  224 L=2,460,672, delta=307,200, pad=   0, time =  0.259, cum =   1.656
  124 L=2,767,872, delta=307,200, pad=   0, time =  0.225, cum =   1.882
   24 L=3,075,072, delta=307,200, pad=   0, time =  0.234, cum =   2.115
pixels generated, len = 3145728
making bmp with 3145728 pixels
headers = BM
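The lengths in this log can be cross-checked by hand. A short sketch of the arithmetic, assuming 3 bytes per pixel (24-bit colour depth) and the 54-byte header implied by 'offset':54 in the code:

```python
width = height = 1024
bytes_per_pixel = 3                      # 24-bit colour depth
header_len = 54                          # 'offset' in the header dict

row_bytes = width * bytes_per_pixel      # length after the first row
padding = (4 - row_bytes % 4) % 4        # rows are padded to a multiple of 4 bytes
per_100_rows = 100 * (row_bytes + padding)
pixel_len = height * (row_bytes + padding)
file_len = header_len + pixel_len

print(row_bytes, padding, per_100_rows, pixel_len, file_len)
```

The first three values correspond to "L= 3,072", "pad= 0" and "delta=307,200" in the log, and `pixel_len` to "pixels generated, len = 3145728".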
3. Python 3.7 Code
# Testing struct.pack and string catenation in Python2 and 3
# This is a demo cut down from real app (which draws charts from survey data)
# creates a 'square rainbow' bmp file
## edit for Py 2 (char strings) or 3 (byte strings) versions

# edit these for your set up and test
Size = 1024    # test image size, pixels
path = 'D:/Python37/MyScripts/Test/'  # for the bmp file

import csv
import os
import struct
from math import trunc, ceil, floor
import time


def BuildImage(name, XY):
    # name : filename
    # XY : (width, height) pixels

    # for stats and timing
    n0 = 0
    t00 = time.clock()
    t01 = t00
    
    chtName = path+'cht_'+name+'.bmp'
    print("drawing "+chtName)
    
    hdr = bmpHdr(XY)
    #print(hdr)
    
    ##pixels =''  ##  Py2
    pixels = bytes('', 'utf-8')  ## Py3
    
    for Y in range(0, XY[1]): # (BMPs are L to R from the bottom L row)
        for X in range(0, XY[0]):
            # square rainbow for time tests -  as opposed to real data
            x = floor((255 * X)/XY[0])
            y = floor((255 * Y)/XY[1])
            (r,g,b) = [x, y, 128]   #Colour(data[x ,y])
            pixels += struct.pack('<BBB',b,g,r)
            
        row_mod = (hdr['width']*hdr['colordepth']//8) % 4  # '//' keeps this an int; '/' gives a float in Py3 and breaks range(padding)
        if row_mod == 0:
            padding = 0 
        else:
            padding = (4 - row_mod)
        ##padbytes = ''  #  P2
        padbytes = bytes('', 'utf-8')  # P3
        for i in range(padding):
            padbytes += struct.pack('<B',0)
        pixels = pixels + padbytes

        # stats log
        if(0 == Y % 100 or Y == 0):
            n = len(pixels)
            t02 = time.clock()
            log = "{0:5d} L={1:8,d}, delta={2:7,d}, pad={3:4d}".format(XY[1]-Y, n, n-n0, padding)  # rows remaining = height - Y
            log += ", time = {0:6.3f}, cum = {1:7.3f}".format(t02-t01, t02-t00)
            print(log)
            t01 = t02
            n0 = n
    
    print("pixels generated, len = "+str(len(pixels)))
    bmp_write(chtName, hdr, pixels)
    

def bmpHdr(XY):
    print("bmphdr xy "+str(XY))
    hdr = {
        'mn1':66,
        'mn2':77,
        'filesize':0,
        'undef1':0,
        'undef2':0,
        'offset':54,
        'headerlength':40,
        'width':XY[0],   #256
        'height':XY[1],  #256
        'colorplanes':0,
        'colordepth':24,
        'compression':0,
        'imagesize':0,
        'res_hor':0,
        'res_vert':0,
        'palette':0,
        'importantcolors':0
        }
    return hdr


#Function to write a bmp file.  It takes a dictionary (hdr) of
#header values and the pixel data (pixels) and writes them
#to a file.  This function is called at the bottom of the code.
def bmp_write(name, hdr, pixels):
    print('making bmp with '+str(len(pixels))+" pixels")
    mn1 = struct.pack('<B',hdr['mn1'])
    mn2 = struct.pack('<B',hdr['mn2'])
    filesize = struct.pack('<L',hdr['filesize'])
    undef1 = struct.pack('<H',hdr['undef1'])
    undef2 = struct.pack('<H',hdr['undef2'])
    offset = struct.pack('<L',hdr['offset'])
    headerlength = struct.pack('<L',hdr['headerlength'])
    width = struct.pack('<L',hdr['width'])
    height = struct.pack('<L',hdr['height'])
    colorplanes = struct.pack('<H',hdr['colorplanes'])
    colordepth = struct.pack('<H',hdr['colordepth'])
    compression = struct.pack('<L',hdr['compression'])
    imagesize = struct.pack('<L',hdr['imagesize'])
    res_hor = struct.pack('<L',hdr['res_hor'])
    res_vert = struct.pack('<L',hdr['res_vert'])
    palette = struct.pack('<L',hdr['palette'])
    importantcolors = struct.pack('<L',hdr['importantcolors'])
    #create the outfile
    outfile = open(name,'wb')   # 'bitmap_image.bmp'
    #write the header + the_bytes
    hdr = mn1+mn2
    hdr += filesize+undef1+undef2
    hdr += offset+headerlength+width+height
    hdr += colorplanes+colordepth+compression+imagesize+res_hor+res_vert
    hdr += palette+importantcolors
    print("headers = "+str(hdr))
    bmp = hdr + pixels
    print('writing bmp, len = '+str(len(bmp)))
    outfile.write(bmp)

###################################    
def main():

    time0 = time.clock()
    print("start {0}x{0} bmp file @ {1:.3f}".format(Size, time0))

    # set the size of the bmp image here
    BuildImage("test", (Size,Size))
    time1 = time.clock()
    print("Chart complete, run time {0:.3f} secs".format(time1-time0))
    

if __name__ == '__main__':
    main()
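As an aside on bmp_write: the seventeen separate struct.pack calls could be collapsed into a single call. A minimal sketch, assuming the same field order and the same values as the hdr dict above. `BMP_HEADER` and `pack_header` are hypothetical names, not part of the program.

```python
import struct

# '<' selects little-endian, standard sizes and no alignment padding.
# Fields, in order: magic (2s), filesize (L), undef1, undef2 (H, H),
# offset, headerlength, width, height (L x 4), colorplanes, colordepth (H, H),
# compression, imagesize, res_hor, res_vert, palette, importantcolors (L x 6).
BMP_HEADER = struct.Struct('<2sLHHLLLLHHLLLLLL')


def pack_header(width, height):
    """Pack the 54-byte header using the values from the hdr dict, unchanged."""
    return BMP_HEADER.pack(
        b'BM',              # mn1, mn2 (66, 77)
        0,                  # filesize, left at 0 as in the original
        0, 0,               # undef1, undef2
        54,                 # offset
        40,                 # headerlength
        width, height,
        0,                  # colorplanes, value copied from the original
        24,                 # colordepth
        0, 0, 0, 0, 0, 0,   # compression ... importantcolors
    )


header = pack_header(1024, 1024)
print(len(header), header[:2])
```

A fixed `struct.Struct` also makes the header length available as `BMP_HEADER.size` instead of the hard-coded 54.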
4. Python 3.7 Output
Output:
Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>>
=============== RESTART: D:\Python37\MyScripts\Test\test37.py ===============

Warning (from warnings module):
  File "D:\Python37\MyScripts\Test\test37.py", line 130
    time0 = time.clock()
DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
start 1024x1024 bmp file @ 0.415

Warning (from warnings module):
  File "D:\Python37\MyScripts\Test\test37.py", line 23
    t00 = time.clock()
DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
drawing D:/Python37/MyScripts/Test/cht_test.bmp
bmphdr xy (1024, 1024)

Warning (from warnings module):
  File "D:\Python37\MyScripts\Test\test37.py", line 57
    t02 = time.clock()
DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
 1024 L=   3,072, delta=  3,072, pad=   0, time =  0.006, cum =   0.006
  924 L= 310,272, delta=307,200, pad=   0, time =  0.541, cum =   0.546
  824 L= 617,472, delta=307,200, pad=   0, time =  1.545, cum =   2.091
  724 L= 924,672, delta=307,200, pad=   0, time =  2.540, cum =   4.632
  624 L=1,231,872, delta=307,200, pad=   0, time = 28.350, cum =  32.981
  524 L=1,539,072, delta=307,200, pad=   0, time = 55.790, cum =  88.771
  424 L=1,846,272, delta=307,200, pad=   0, time = 64.711, cum = 153.482
  324 L=2,153,472, delta=307,200, pad=   0, time = 76.717, cum = 230.199
  224 L=2,460,672, delta=307,200, pad=   0, time = 86.619, cum = 316.818
  124 L=2,767,872, delta=307,200, pad=   0, time = 98.077, cum = 414.895
   24 L=3,075,072, delta=307,200, pad=   0, time = 105.663, cum = 520.558
pixels generated, len = 3145728
making bmp with 3145728 pixels
headers = b'BM\x00\x00\x00\x00\x00\x00\x00\x006\x00\x00\x00(\x00\x00\x00\x00\x04\x00\x00\x00\x04\x00\x00\x00\x00\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
writing bmp, len = 3145782

Warning (from warnings module):
  File "D:\Python37\MyScripts\Test\test37.py", line 135
    time1 = time.clock()
DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
Chart complete, run time 547.024 secs
>>>
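The deprecation warnings in this log point at `time.perf_counter`. A minimal sketch, assuming Python 3, of timing the two accumulation styles side by side with it. `time_concat` is a hypothetical helper, and the sizes are kept small so it finishes quickly. No timings are quoted here, since they depend on the machine and the Python build.

```python
import struct
import time


def time_concat(make_empty, n_pixels):
    """Append n_pixels packed 3-byte pixels; return (elapsed seconds, final length)."""
    acc = make_empty()                     # bytes() or bytearray(), both start empty
    start = time.perf_counter()
    for i in range(n_pixels):
        acc += struct.pack('<BBB', 128, i % 256, 0)
    return time.perf_counter() - start, len(acc)


for n in (10000, 20000, 40000):
    t_bytes, len_bytes = time_concat(bytes, n)
    t_array, len_array = time_concat(bytearray, n)
    assert len_bytes == len_array == 3 * n
    print("{0:7,d} pixels: bytes {1:.3f}s, bytearray {2:.3f}s".format(n, t_bytes, t_array))
```

If the `bytes` column grows much faster than the pixel count while the `bytearray` column stays roughly proportional to it, that would be consistent with the per-100-row times climbing in the log above.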


Messages In This Thread
RE: Byte string catenation inefficient in 3.7? - by RMJFlack - Aug-16-2019, 01:59 PM
