How can I solve this file handling issue?

GiggsB · Feb-12-2022, 06:52 AM

I am using the SPI communication protocol, where a microcontroller sends data to a Raspberry Pi. The Raspberry Pi collects this data and stores it in a file, using a Python script.

I am sending ~800,000 bytes (8-bit integers) from the microcontroller. On the Raspberry Pi, after 2048 bytes are collected, each group of 4 bytes is joined to form a 32-bit integer and stored in a file, and then the loop starts again to collect the next set of 2048 bytes.

The problem is that all of the data is received within 3 seconds (I ran the script without any file handling operations), but with file handling it takes a total of ~20 seconds. Is there an efficient way to reduce the time from 20 seconds to under 5-10 seconds?

I am not using the struct.unpack() function because, first of all, it expects an array of length 4, so I would need to create another array, store 4 values in it, and then call that function, which would take more time. Secondly, Link shows that bit shifting operation takes less time than struct.unpack(). Thanks.

import time
import spidev
import array as arr
import struct
import datetime as datetime
import os
import pigpio

# We only have SPI bus 0 available to us on the Pi
bus = 0
#Device is the chip select pin. Set to 0 or 1, depending on the connections
device = 0

# Enable SPI
spi = spidev.SpiDev()

# Open a connection to a specific bus and device (chip select pin)
spi.open(bus, device)

# Set SPI speed and mode
spi.max_speed_hz = 4000000
spi.mode = 0

pi=pigpio.pi()
pi.set_mode(25, pigpio.INPUT)
k=0
value=[]

def output_file_path():
    return os.path.join(os.path.dirname(__file__),
               datetime.datetime.now().strftime("%dT%H.%M.%S") + ".csv")

print("Enter '1' to start the process")
a=input()

if a == '1':
    print("SM1 Process started...")
    spi.xfer2([0x01])
    while True:
        if pi.wait_for_edge(25, pigpio.RISING_EDGE, 5.0):
            print("Detected")
            with open(output_file_path(), 'w') as f:
                t1=datetime.datetime.now()
                data=[0]*2048
                for x in range(392):
                    spi.xfer2(data)
                    #value=struct.unpack("<I", bytearray(data))[0]
                    for y in range(0,2048,4):
                        value=data[y]<<24 | data[y+1]<<16 | data[y+2]<<8 | data[y+3]
                        f.write(str(value)+'\n')
                f.close()
                t2=datetime.datetime.now()
            print(t2-t1)
            break
else: 
    print("Wrong input.")
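
On the struct.unpack() concern raised in the question: the function is not limited to four bytes per call. The following is a minimal sketch, using made-up sample bytes rather than real SPI data, that decodes a whole 2048-byte chunk in one call and checks that the result matches the bit-shift expression. It assumes big-endian order (`>`), which is what the shift expression implies because the first byte is shifted the furthest.

```python
import struct

# Made-up stand-in for one 2048-byte SPI chunk (not real device data).
data = [(i * 7) % 256 for i in range(2048)]

# Bit-shift version from the script: the first byte is the most significant.
shifted = [data[y] << 24 | data[y + 1] << 16 | data[y + 2] << 8 | data[y + 3]
           for y in range(0, 2048, 4)]

# One call decodes all 512 big-endian unsigned 32-bit integers.
unpacked = struct.unpack(">512I", bytes(data))

# A single value can also be read at a byte offset, without a 4-element copy.
third = struct.unpack_from(">I", bytes(data), 8)[0]

print(list(unpacked) == shifted)   # True
print(third == shifted[2])         # True
```

Note that the commented-out line in the script uses `"<I"`, which is little-endian and would not give the same values as the shift expression.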

stevendaprano · Feb-12-2022, 07:40 AM

Hi GiggsB.

Quote:The problem is that all of the data is received within 3 seconds (I ran the script without any file handling operations),

Can you show that script that uses no file handling operations? I don't have a Raspberry Pi to run your code, but nothing there looks slow except for reading the data over SPI, and who knows how long that will take?

To me, it looks like:

1. You create a new file name (very fast, almost instantaneous).
2. 392 times, you collect 2048 values over SPI. That could be slow.
3. You grab those 2048 values and convert them to 32-bit ints, which will be fast. There are only 2048 of them, so Python will process them in about a millisecond. (On my computer, it takes a bit more than half a millisecond.)
4. You write those 32-bit ints to a file as text, which is almost instantaneous.

On my computer, the last two steps together take about 1.6 milliseconds. Suppose you double that time (3.2 ms) then repeating it 392 times will take about 1.2 seconds. If there is anything slow, it can only be in reading the data in the first place.
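
The arithmetic behind this estimate can be written out as a short sketch. The loop counts and SPI speed come from the script in the question. The minimum transfer time is an idealisation that assumes the SPI clock runs continuously at the configured rate with no gaps between bytes or transfers, so the real figure will be higher.

```python
CHUNKS = 392            # outer loop count in the script
CHUNK_BYTES = 2048      # bytes per spi.xfer2 call
SPI_HZ = 4_000_000      # spi.max_speed_hz in the script

total_bytes = CHUNKS * CHUNK_BYTES           # 802,816 bytes in total
total_values = total_bytes // 4              # 200,704 32-bit integers
min_transfer_s = total_bytes * 8 / SPI_HZ    # idealised time on the wire
processing_s = CHUNKS * 3.2e-3               # doubled per-chunk estimate above

print(total_bytes, total_values)
print(f"transfer >= {min_transfer_s:.2f} s, processing ~ {processing_s:.2f} s")
```

Under these assumptions the wire time alone is about 1.6 seconds, which is in line with the 2-3 seconds reported for the run without file writes.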

**Gribouillis** · Feb-12-2022, 08:07 AM

@GiggsB The link you showed was about Ruby code, not Python code. It says nothing about Python.

GiggsB · Feb-12-2022, 08:08 AM

Quote:Can you show that script that uses no file handling operations?

Hi Stevendaprano,
Thank you for the reply. I did some tests and have shared the results plus the code.

1. Test 1:
With file handling and the code same as in the question:

2. Test 2:
Writing the raw data to the file (without converting it to 32-bit integers, so no bit-shifting operation). For this, I just commented out the inner "y" loop and the bit-shifting operation. This takes ~5 sec.

3. Test 3:
Without any write operation. The code just opens and closes the file but does not write anything. It takes ~2-3 sec.

Code for this test:

import time
import spidev
import array as arr
import struct
import datetime as datetime
import os
import pigpio

# We only have SPI bus 0 available to us on the Pi
bus = 0
#Device is the chip select pin. Set to 0 or 1, depending on the connections
device = 0

# Enable SPI
spi = spidev.SpiDev()

# Open a connection to a specific bus and device (chip select pin)
spi.open(bus, device)

# Set SPI speed and mode
spi.max_speed_hz = 4000000
spi.mode = 0

pi=pigpio.pi()
pi.set_mode(25, pigpio.INPUT)
k=0
value=[]

def output_file_path():
    return os.path.join(os.path.dirname(__file__),
               datetime.datetime.now().strftime("%dT%H.%M.%S") + ".csv")

print("Enter '1' to start the process")
a=input()

if a == '1':
    print("SM1 Process started...")
    spi.xfer2([0x01])
    while True:
        if pi.wait_for_edge(25, pigpio.RISING_EDGE, 5.0):
            print("Detected")
            with open(output_file_path(), 'w') as f:
                t1=datetime.datetime.now()
                data=[0]*2048
                for x in range(392):
                    spi.xfer2(data)
                f.close()
                t2=datetime.datetime.now()
            print(t2-t1)
            break
else: 
    print("Wrong input.")

This makes me believe that most of the time is taken by the bit shifting operation (most probably) or the inner loop.
Thanks.
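
One way to narrow this down without the hardware is to time the inner loop in isolation, adding one operation at a time. This is only a sketch under stated assumptions: the chunk is made-up data, and an in-memory `io.StringIO` stands in for the file, so absolute numbers on a Raspberry Pi writing to an SD card will differ.

```python
import io
import time

CHUNKS = 392
chunk = [(i * 13) % 256 for i in range(2048)]  # made-up sample, not real SPI data
sink = io.StringIO()                           # in-memory stand-in for the file

def shift_only():
    for y in range(0, 2048, 4):
        value = chunk[y] << 24 | chunk[y + 1] << 16 | chunk[y + 2] << 8 | chunk[y + 3]

def shift_and_str():
    for y in range(0, 2048, 4):
        value = chunk[y] << 24 | chunk[y + 1] << 16 | chunk[y + 2] << 8 | chunk[y + 3]
        line = str(value) + '\n'

def shift_str_write():
    for y in range(0, 2048, 4):
        value = chunk[y] << 24 | chunk[y + 1] << 16 | chunk[y + 2] << 8 | chunk[y + 3]
        sink.write(str(value) + '\n')

def elapsed(work):
    """Run work() once per chunk and return the total elapsed seconds."""
    start = time.perf_counter()
    for _ in range(CHUNKS):
        work()
    return time.perf_counter() - start

timings = {work.__name__: elapsed(work)
           for work in (shift_only, shift_and_str, shift_str_write)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f} s")
```

The difference between consecutive lines of output gives a rough cost for the string conversion and for the write call respectively.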

GiggsB · Feb-12-2022, 08:14 AM

(Feb-12-2022, 08:07 AM)Gribouillis Wrote: @GiggsB The link you showed was about Ruby code, not Python code. It says nothing about Python.

Thanks for the reply.
Yes, you are right. But I was thinking that struct.unpack() is a Python function and bit shifting is just a common operation, so it would not matter what language we use. That is why I was considering it a valid comparison for my case.

But in my code as I mentioned, I would need to create another array of 4 values and pass 4 values every time inside the for loop and pass to unpack() function so it would consume more time. Please correct me if I am wrong.

**Gribouillis** · Feb-12-2022, 08:22 AM

GiggsB Wrote:But in my code as I mentioned, I would need to create another array of 4 values and pass 4 values every time inside the for loop and pass to unpack() function so it would consume more time. Please correct me if I am wrong.

I don't think these performance guesses are relevant. For example, Python integers are not C integers: they are unlimited in size, and when you write data[y] << 24, it creates a new integer object.

Don't guess performance, measure it.
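
The point about Python integers can be seen in a few lines. This is an illustrative sketch only; the object size printed at the end is specific to CPython and varies between versions and platforms.

```python
import sys

big = 1 << 100                   # no overflow: Python ints grow as needed
print(big.bit_length())          # 101

byte_value = 0xAB
shifted = byte_value << 24       # the result is a separate int object
print(shifted is byte_value)     # False
print(sys.getsizeof(shifted))    # size of the int object in bytes (CPython-specific)
```

Because every shift and OR produces a new object, intuition carried over from C or Ruby about the cost of bit operations does not transfer directly.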

GiggsB · Feb-12-2022, 08:24 AM

(Feb-12-2022, 08:22 AM)Gribouillis Wrote: Don't guess performance, measure it.

Oh, I see. Thank you for clarifying. I will run the tests right away.

stevendaprano · Feb-12-2022, 08:26 AM

(Feb-12-2022, 06:52 AM)GiggsB Wrote: Secondly, Link shows that bit shifting operation takes less time than struct.unpack().

You are linking to a seven-year-old post for a completely different programming language.

For Python 3.10 on my computer running Linux, I can unpack 2048 1-byte ints into 512 4-byte (32-bit) ints in less than 1 millisecond. Plenty fast enough.

import struct
import time

# 16 sample values repeated 128 times gives the 2048 byte values asserted below.
L = [255, 0, 0, 0, 0, 0, 0, 1, 255, 255, 255, 255, 0, 0, 255, 0]*128
assert len(L) == 2048
format = ">" +"I"*512
t = time.time()
x = struct.unpack(format, bytes(L))
assert len(x) == 512
print(time.time() - t)

I haven't bothered to compare it to manual unpacking and bit operations, but I am 95% sure that struct.unpack will be faster.
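
For anyone who wants to run that comparison, here is a hedged sketch using `timeit` on made-up data. It times three variants: the bit-shift loop, one struct.unpack call per chunk, and one struct.unpack call per 4 bytes. The last variant is an assumption about how the per-value approach described in the question might look; it is included because the per-call overhead can change which approach wins.

```python
import struct
import timeit

data = [(i * 31) % 256 for i in range(2048)]   # made-up sample chunk

def by_shifting():
    return [data[y] << 24 | data[y + 1] << 16 | data[y + 2] << 8 | data[y + 3]
            for y in range(0, 2048, 4)]

def by_unpack_chunk():
    # One call for the whole chunk.
    return list(struct.unpack(">512I", bytes(data)))

def by_unpack_each():
    # One call per 4 bytes, with a temporary 4-byte object each time.
    return [struct.unpack(">I", bytes(data[y:y + 4]))[0]
            for y in range(0, 2048, 4)]

# All three must agree before their timings are worth comparing.
assert by_shifting() == by_unpack_chunk() == by_unpack_each()

RUNS = 200
for func in (by_shifting, by_unpack_chunk, by_unpack_each):
    best = min(timeit.repeat(func, number=RUNS, repeat=5))
    print(f"{func.__name__}: {best / RUNS * 1e3:.3f} ms per chunk")
```

Results depend on the machine and Python version, so the numbers should be collected on the Raspberry Pi itself.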

stevendaprano · Feb-12-2022, 08:42 AM

I've re-written your code, removing some unneeded imports and redundant code, and putting in some timing code to show you where the time is actually being spent:

# Untested.
import datetime
import os
import struct

import pigpio
import spidev

# We only have SPI bus 0 available to us on the Pi
bus = 0
#Device is the chip select pin. Set to 0 or 1, depending on the connections
device = 0

# Enable SPI
spi = spidev.SpiDev()
# Open a connection to a specific bus and device (chip select pin)
spi.open(bus, device)
# Set SPI speed and mode
spi.max_speed_hz = 4000000
spi.mode = 0

pi = pigpio.pi()
pi.set_mode(25, pigpio.INPUT)

def output_file_path():
    return os.path.join(os.path.dirname(__file__),
               datetime.datetime.now().strftime("%dT%H.%M.%S") + ".csv")

input("Press Enter to start the process ")
print("SM1 Process started...")
spi.xfer2([0x01])
while True:
    if pi.wait_for_edge(25, pigpio.RISING_EDGE, 5.0):
        print("Detected")
        data = [0]*2048
        
        with open(output_file_path(), 'w') as f:
            for x in range(392):
                t1 = datetime.datetime.now()
                spi.xfer2(data)
                print("time taken reading data:", datetime.datetime.now() - t1)
                t1 = datetime.datetime.now()
                for y in range(0, 2048, 4):
                    value=data[y]<<24 | data[y+1]<<16 | data[y+2]<<8 | data[y+3]
                    f.write(str(value) + '\n')
                print("time taken writing data:", datetime.datetime.now() - t1)
        break

My prediction is that writing will not take anywhere near as much time as you think.

If that code works as you expect, then try changing these three lines:

for y in range(0, 2048, 4):
    value=data[y]<<24 | data[y+1]<<16 | data[y+2]<<8 | data[y+3]
    f.write(str(value) + '\n')

into this:

values = struct.unpack(">" +"I"*512, bytes(data))
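# The trailing newline keeps the last value of one chunk off the same line as the first value of the next.
f.write('\n'.join([str(n) for n in values]) + '\n')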

and see if it is faster. (Disclaimer: I have not run this code, it may contain typos or other errors.)
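
A further option, not proposed in the thread, is to keep file I/O out of the acquisition loop entirely: collect every chunk in memory first, then decode and write once at the end. The sketch below assumes the whole capture (about 800 KB) fits comfortably in memory. `read_chunk` is a hypothetical stand-in for the SPI read, and the output path is a throwaway file in the temporary directory.

```python
import os
import struct
import tempfile

CHUNKS = 392
CHUNK_BYTES = 2048

def read_chunk(index):
    """Hypothetical stand-in for the SPI read; returns 2048 byte values."""
    return [(index + i) % 256 for i in range(CHUNK_BYTES)]

# Phase 1: acquire everything into one buffer, with no file I/O in the loop.
raw = bytearray()
for x in range(CHUNKS):
    raw.extend(read_chunk(x))

# Phase 2: decode all big-endian 32-bit values and write them in one call.
count = len(raw) // 4
values = struct.unpack(f">{count}I", raw)

path = os.path.join(tempfile.gettempdir(), "spi_values_demo.csv")
with open(path, "w") as f:
    f.write("\n".join(map(str, values)) + "\n")

print(count, "values written to", path)
```

Whether this helps depends on where the time actually goes, which the timing code above is meant to reveal.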

GiggsB · Feb-12-2022, 08:51 AM

Yes, my bad, really sorry for posting the wrong link. However, I ran a test to compare both strategies, and 90% of the time the bit shifting operation was faster than struct.unpack().

Link for the image
Sorry, I couldn't find an option to place the image in the reply. Please check my results from the above link.

Stevendaprano, I will work on the suggestions and let you know the results. Thanks.
