Python Forum
python Multithreading on single file
#1
Hi Team,

I want to use multithreading on a single CSV file.


Task one ----> create a gzip of the CSV file
Task two ----> generate a checksum for the file


Below is my attempted code. Is this the correct way of doing it?

import threading
class MyClass:
    pass

    def Create_checksum(path):
        print("checksum Created")

    def Create_gzip(path):
        print("gzip_Created")


def main():
    path = "d:\\output\\abc.csv"
    # Call Checksum Function to generate Checksum no

    data = MyClass

    t1 = threading.Thread(target=data.Create_checksum, args=(path,))
    t2 = threading.Thread(target=data.Create_gzip, args=(path,))

    t1.start()
    t2.start()

    t1.join()
    t2.join()

    # both threads completely executed
    print("Done!")


if __name__ == "__main__":
    main()
#2
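That is not how a class is normally written: the functions take no self, and you call them on the class itself rather than on an instance.
If you want to place functions in a class, look at @staticmethod.
A class makes no sense here, so just remove it.
Also, to check that threading works, the total time should be around 5 seconds and not 10 seconds.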
# th.py
import threading
from time import sleep

def create_checksum(path):
    sleep(5)
    print("checksum Created")

def create_gzip(path):
    sleep(5)
    print("gzip_Created")

def main():
    path = "d:\\output\\abc.csv"
    # Call Checksum Function to generate Checksum no
    t1 = threading.Thread(target=create_checksum, args=(path,))
    t2 = threading.Thread(target=create_gzip, args=(path,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    # both threads completely executed
    print("Done!")

if __name__ == "__main__":
    main()
Output:
=== python th.py ===
checksum Created
gzip_Created
Done!
Execution time: 5.228 s
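The functions above only sleep and print. As a minimal sketch of what the two tasks could look like with real work, the version below hashes the file and writes a compressed copy in two threads. Several things here are assumptions, not something stated in the thread: SHA-256 as the checksum algorithm, writing the archive next to the source as path + ".gz", and the results dict and process_file helper, which are hypothetical names used only to hand values back from the threads.

```python
import gzip
import hashlib
import shutil
import threading


def create_checksum(path, results):
    """Hash the file in 1 MiB chunks (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256()
    with open(path, "rb") as src:
        for chunk in iter(lambda: src.read(1024 * 1024), b""):
            digest.update(chunk)
    results["checksum"] = digest.hexdigest()


def create_gzip(path, results):
    """Write a compressed copy next to the original as <path>.gz (assumed naming)."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    results["gzip"] = gz_path


def process_file(path):
    """Run both tasks on the same file in two threads and return their results."""
    results = {}  # each thread writes its own key
    threads = [
        threading.Thread(target=create_checksum, args=(path, results)),
        threading.Thread(target=create_gzip, args=(path, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling process_file("d:\\output\\abc.csv") would return a dict with the hex digest under "checksum" and the archive path under "gzip". Both threads only read the source file, so they do not interfere with each other.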
#3
Hi Snippsat,

Thanks for your help. Is sleep compulsory in multithreading? What does it do?


I am also interested in multiprocessing.
Which is better for my scenario above, multiprocessing or multithreading?

How would I use multiprocessing for the task above?


# The Pool class
# Another and more convenient approach for
# simple parallel processing tasks is provided by the Pool class.
# It has 4 main methods:
# Pool.apply
# Pool.map
# Pool.apply_async
# Pool.map_async
#
# Pool.apply calls a function with the given arguments and blocks until
# the result is ready; Pool.map is a parallel equivalent of Python's
# built-in map function.


import multiprocessing
import os
def square(n):
    print("Worker process id for {0}:{1}".format(n,os.getpid()))
    return(n*n)

if __name__ == "__main__":
    # input list
    arr = [1, 2, 3, 4, 5]

    # create a pool object; the with block shuts the workers down afterwards
    with multiprocessing.Pool() as p:
        # map list to target function
        result = p.map(square, arr)

    print("Square of each element:")
    print(result)
#4
(Nov-04-2022, 10:54 PM)mg24 Wrote: Thanks for your help. Is sleep compulsory in multithreading? What does it do?
No, sleep() is just used as a stand-in for a longer-running task, to test that threading works.
Without threading the total time will be 10 seconds, as expected.
# pp.py
from time import sleep

def create_checksum(path):
    sleep(5)
    print("checksum Created")

def create_gzip(path):
    sleep(5)
    print("gzip_Created")

path = "d:\\output\\abc.csv"
create_checksum(path)
create_gzip(path)
Output:
=== python pp.py ===
checksum Created
gzip_Created
Execution time: 10.247 s
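The Execution time lines in these outputs are not printed by the scripts themselves. If you want to measure the difference yourself, here is a rough sketch using time.perf_counter. The helper names task, timed_sequential and timed_threaded are made up for this illustration.

```python
import threading
import time


def task(seconds):
    """Stand-in for real work, same idea as sleep(5) in the posts above."""
    time.sleep(seconds)


def timed_sequential(seconds):
    """Run the task twice, one after the other, and return the elapsed time."""
    start = time.perf_counter()
    task(seconds)
    task(seconds)
    return time.perf_counter() - start


def timed_threaded(seconds):
    """Run the task twice in two threads and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=task, args=(seconds,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

With seconds=5, timed_sequential should come out at roughly twice the value of timed_threaded, which is the same 10-second versus 5-second gap as in the outputs above.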
Quote: I am interested in multiprocessing.
Which is better for my scenario above, multiprocessing or multithreading?
Look into concurrent.futures; it makes it easier to switch between threading and multiprocessing, as both use the same interface.
Here is an example.
import concurrent.futures
import os

def square(n):
    print("Worker process id for {0}:{1}".format(n, os.getpid()))
    return(n ** n)

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
        lst = [1000000, 2000000, 3000000, 1000000, 2000000, 3000000]
        for n in lst:
            executor.submit(square, n)
Output:
=== python sq.py ===
Worker process id for 1000000:7184
Worker process id for 2000000:13784
Worker process id for 3000000:20116
Worker process id for 1000000:16140
Worker process id for 2000000:23144
Worker process id for 3000000:20544
Execution time: 32.575 s
If you set max_workers=1, only one worker process is used, so only one CPU core does the work.
Output:
=== python sq.py ===
Worker process id for 1000000:20700
Worker process id for 2000000:20700
Worker process id for 3000000:20700
Worker process id for 1000000:20700
Worker process id for 2000000:20700
Worker process id for 3000000:20700
Execution time: 94.146 s
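The example above submits the tasks but never reads their return values. As a small sketch of the "same interface" point, the helper below takes the executor class as a parameter and collects the results through future.result(). The names run_all and executor_cls are hypothetical, and square here really squares the number to keep the run short.

```python
import concurrent.futures


def square(n):
    """Small stand-in task; returns n squared."""
    return n * n


def run_all(executor_cls, numbers, max_workers=4):
    """Submit one task per number and return the results in input order."""
    with executor_cls(max_workers=max_workers) as executor:
        futures = [executor.submit(square, n) for n in numbers]
        # result() blocks until that task is done and re-raises any exception
        return [future.result() for future in futures]


thread_results = run_all(concurrent.futures.ThreadPoolExecutor, [1, 2, 3, 4, 5])
print(thread_results)
```

Passing concurrent.futures.ProcessPoolExecutor instead of ThreadPoolExecutor runs the same tasks in separate processes; in that case the call should sit under an if __name__ == "__main__": guard, as in the example above.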

One more piece of advice when looking into this: Python is used a lot for data science,
e.g. machine learning, which needs a lot of computing power, so much of the innovation is happening there.
I think many forget that the tools built for these tasks can work on any Python code. Some examples:
PyTorch | torch.multiprocessing is a drop-in replacement for Python's multiprocessing module.
Dask | short overview
Ray
These can, for example, use GPUs or distributed systems, which can far outperform standard Python tools like threading and multiprocessing.