Multithreaded file transfer from multiple volumes
#1
I've written a script that copies .mp4 files from multiple volumes on the Mac, which all works as intended, but I'm looking at improving the speed and was looking into using multiple threads to see if that would help. I'm reasonably new to Python and only write scripts to make my life easier, so I'm not too familiar with multithreading etc.

A brief overview of what I'm doing and trying to achieve:
We film in 360, where the camera saves 8K video files from 6 cameras to 6 SD cards plus 1 SD card for the metadata. I then plug those 7 SD cards into the MacBook via a USB 3.0 hub and import using the camera's software. The issue is that it takes around 2 hours to transfer 225 GB of data. After writing some code to run through the cards copying the files one by one, I managed to get this down to 47 minutes for 225 GB.

I'm just wondering if I can get it any quicker?

Cheers

Here's the code I have at the moment:

import os
import shutil
import time
from itertools import chain
from concurrent.futures import ThreadPoolExecutor

local_directory_path = '/Users/chris/Desktop/Filming/'
sd_card_paths = ('/Volumes/Untitled', '/Volumes/Untitled 1', '/Volumes/Untitled 2',
                 '/Volumes/Untitled 3', '/Volumes/Untitled 4', '/Volumes/Untitled 5',
                 '/Volumes/Untitled 6')


def convert(seconds):
    # Convert a number of seconds into an H:MM:SS string

    seconds = seconds % (24 * 3600)
    hour = seconds // 3600
    seconds %= 3600
    minutes = seconds // 60
    seconds %= 60

    return "%d:%02d:%02d" % (hour, minutes, seconds)


def transfer_all_files():
    # Walk through the folders on the sd card

    for path, dirs, files in chain.from_iterable(os.walk(path) for path in sd_card_paths):

        for file_name in files:

            # Set the path variables for the source and destination
            src_path = os.path.join(path, file_name)
            dst_folder = os.path.join(local_directory_path, path.split(os.path.sep)[-1])

            # Check if the folder exists and create it if not
            if not os.path.exists(dst_folder):
                os.mkdir(dst_folder)

            # Skip the camera's marker file
            if file_name != '.pro_suc':

                print(f"Copying file: {path}/{file_name}")

                dst_path = os.path.join(dst_folder, file_name)
                shutil.copy(src_path, dst_path)


def verify_copied_files():
    # Will check the file on the SD card and compare the sizes

    print('Verifying all transferred files')
    pass


if __name__ == '__main__':
    startTime = time.time()
    transfer_all_files()
    verify_copied_files()
    executionTime = (time.time() - startTime)
    print('Copied in: ' + convert(executionTime))
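
verify_copied_files() is just a placeholder so far; a rough, untested sketch of the size comparison I have in mind (reusing sd_card_paths and local_directory_path from the script above) would be something like:

# Rough, untested sketch of the size check the stub describes.
# Relies on the sd_card_paths and local_directory_path defined above.
import os
from itertools import chain


def verify_copied_files():
    # Compare each source file's size with its copy on the local drive
    print('Verifying all transferred files')

    for path, dirs, files in chain.from_iterable(os.walk(p) for p in sd_card_paths):
        for file_name in files:
            if file_name == '.pro_suc':
                continue

            src_path = os.path.join(path, file_name)
            dst_path = os.path.join(local_directory_path,
                                    path.split(os.path.sep)[-1], file_name)

            if not os.path.exists(dst_path):
                print(f"Missing: {dst_path}")
            elif os.path.getsize(src_path) != os.path.getsize(dst_path):
                print(f"Size mismatch: {dst_path}")

A byte-for-byte hash check would be safer but slower, so for now I'd only compare sizes.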
#2
concurrent.futures is easier to use than multiprocessing, but ThreadPoolExecutor is still threading, hence the "concurrent" (rather than parallel) in the name.
Threads can only read data from one input at a time; execution is just time-sliced, which gives the appearance of parallelism, but it isn't truly parallel.

Multiprocessing will allow tasks to run in parallel, so it should improve overall performance (rough sketch below).
Take a look at some available packages: https://pypi.org/search/?q=fast+parallel...+reading&o=
(I can't recommend one over another, some will be better than others; use the 'trending' search order to list the most popular first.)
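
As a rough, untested sketch of what I mean (reusing your variable names), giving each SD card its own worker process could look something like this:

# Sketch only: one worker process per SD card, each copying its own volume.
import os
import shutil
from concurrent.futures import ProcessPoolExecutor

local_directory_path = '/Users/chris/Desktop/Filming/'
sd_card_paths = ('/Volumes/Untitled', '/Volumes/Untitled 1', '/Volumes/Untitled 2',
                 '/Volumes/Untitled 3', '/Volumes/Untitled 4', '/Volumes/Untitled 5',
                 '/Volumes/Untitled 6')


def copy_volume(volume):
    # Walk one SD card and copy everything except the camera's marker file
    for path, dirs, files in os.walk(volume):
        dst_folder = os.path.join(local_directory_path, path.split(os.path.sep)[-1])
        os.makedirs(dst_folder, exist_ok=True)

        for file_name in files:
            if file_name == '.pro_suc':
                continue
            shutil.copy(os.path.join(path, file_name),
                        os.path.join(dst_folder, file_name))
    return volume


if __name__ == '__main__':
    # One worker process per card
    with ProcessPoolExecutor(max_workers=len(sd_card_paths)) as executor:
        for finished in executor.map(copy_volume, sd_card_paths):
            print(f"Finished copying {finished}")

Whether this actually beats the sequential copy depends on the USB hub and the Mac's disk rather than the CPU, so it's worth timing both.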

You might think about streaming the camera data directly using multiprocessing and eliminating the SD cards completely, if you can access the raw video (time-consuming on the software side, but the vendor may provide code).
#3
Thanks for the reply. Will look into that.
On the old camera we were able to plug an SSD drive in, which was ideal, but it also only needed 1 SD card. On the new camera they've added 7 SD cards, so it records 8K video from 6 cameras directly onto separate SD cards, with no option of adding an SSD drive via USB-C, which I'd have preferred. The process at the moment is: remove the cards, plug them into USB SD card adapters, plug those into the hub, then import using the Mac software for stitching the video files. We never actually use that software for stitching, only for importing all the files in one go, but it's just so slow, which prompted me to brush the dust off my limited Python knowledge and try to get the transfer to go faster.

I'll test the two methods and see which is quicker.

Many thanks for the reply