Python Forum
How to run existing python script parallel using multiprocessing
#1
Hi,

I have a Python script that searches for directories whose names contain a specific string and then looks for the error files in those directories.
I want to run this script as a process, so that I can run 10 processes in parallel at a time.
How can I achieve this? With multiprocessing or multithreading?
Please suggest the best approach and how to call this Python script.

import connect_to_hbase
import glob
import os
import csv
import numpy as np

connection = connect_to_hbase.conn
table = connection.table(connect_to_hbase.table_name_Source)
row_key = '\x00\x00\x00\x01'
res = list()
# Collect every APP_ID value stored in the source table
for row_key, data in table.scan(columns=['DETAILS:APP_ID']):
    result = data.values()
    for i in result:
        res.append(i)

x = np.array(res)
z = np.unique(x)
print(z)
# NOTE: str(z) is one string, so the loop below iterates over its
# characters; iterate over z itself to loop over the application IDs.
patterns = str(z)
table = connect_to_hbase.conn.table(connect_to_hbase.table_name_Target)
base_path = '/ai2/data/dev/admin/inf/*{}*'
for pattern in patterns:
    search_path = base_path.format(pattern)
    for f in glob.glob(search_path):
        print("-----------------------")
        print("The directory path is:")
        print(f)
        print("List of files in the directory are:")
        os.chdir('/ai2/data/dev/admin/inf/')
        os.chdir(f)
        cwd = os.getcwd()
        for subdir, dirs, files in os.walk(cwd, topdown=True):
            for file23 in glob.glob('*.err'):
                print(file23)
Output:
Connected to Hbase
['ACM' 'ACX' 'AW' 'BC' 'BLS' 'CA' 'CLP' 'CMU' 'CR' 'CSE' 'CTD' 'DHD' 'DMS' 'DRM' 'GSK' 'IPT' 'XU0']
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_IPT_pvt
List of files in the directory are:
run_ingest_IPT_daily_1246.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_CLP_pvt
List of files in the directory are:
run_ingest_CLP_daily_1240.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_IPT_pvt
List of files in the directory are:
run_ingest_IPT_daily_1246.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_CTD_pvt
List of files in the directory are:
run_ingest_CTD_daily_1250.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_IPT_pvt
List of files in the directory are:
run_ingest_IPT_daily_1246.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_XU0_pvt
List of files in the directory are:
t_itm.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_ACX_pvt
List of files in the directory are:
accountshighfocus.err
acm_access_log.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_XU0_pvt
List of files in the directory are:
t_itm.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_CMU_pvt
List of files in the directory are:
run_ingest_CMU_daily_1247.err
-----------------------
Thanks
#2
Threads in Python should only be used for input and output tasks. They are not like threads in many other programming languages: in CPython the global interpreter lock (GIL) lets only one thread execute Python code at a time, so when you start several threads they all wait until the currently running thread pauses. Since you are not using shared variables and the only shared thing (the connection) is only read, I would recommend multiprocessing. Processes are dangerous if you work with shared variables and write to them simultaneously, but that is not the case here: you get separate programs that run in parallel, and your computation time goes down.
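A minimal sketch of the multiprocessing approach recommended here, assuming Python 3. The names `find_error_files` and `search_in_parallel` are hypothetical, the pattern list is hard-coded in place of the HBase scan, and the sketch walks each matching directory recursively rather than reproducing the original `os.chdir` logic. It uses `multiprocessing.Pool` with 10 workers, as asked for in the question:

```python
import glob
import os
from multiprocessing import Pool

# Path pattern taken from the original post.
BASE_PATH = '/ai2/data/dev/admin/inf/*{}*'


def find_error_files(pattern, base_path=BASE_PATH):
    """Return (pattern, sorted list of .err files) for one application ID."""
    found = []
    for directory in glob.glob(base_path.format(pattern)):
        if not os.path.isdir(directory):
            continue
        # Walk the directory tree and keep every file ending in .err
        for root, _dirs, files in os.walk(directory):
            found.extend(os.path.join(root, name)
                         for name in files if name.endswith('.err'))
    return pattern, sorted(found)


def search_in_parallel(patterns, workers=10):
    """Run find_error_files for every pattern in a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # pool.map hands one pattern to each free worker process
        return dict(pool.map(find_error_files, patterns))


if __name__ == '__main__':
    # Hypothetical pattern list standing in for the IDs read from HBase
    for pattern, files in search_in_parallel(['ACM', 'ACX', 'AW', 'BC']).items():
        print(pattern, files)
```

Each worker returns its result instead of printing, so the output of different processes does not interleave. The `if __name__ == '__main__':` guard matters on platforms that start worker processes by re-importing the script.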
#3
Hi,

I have a Python script that runs a search for each item in a list.
Now I need to start a separate thread for each item's search in the for loop.
How can I achieve this?
Please help.

import glob
import os
import csv

patterns = ['ACM', 'ACX', 'AW', 'BC']
for pattern in patterns:
    base_path = '/ai2/data/dev/admin/inf/*{}*'
    search_path = base_path.format(pattern)
    # Every directory whose name contains the pattern
    for f in glob.glob(search_path):
        print("-----------------------")
        print("The directory path is:")
        print(f)
        print("List of files in the directory are:")
        os.chdir('/ai2/data/dev/admin/inf/')
        os.chdir(f)
        cwd = os.getcwd()
        for subdir, dirs, files in os.walk(cwd, topdown=True):
            for file23 in glob.glob('*.err'):
                print(file23)
In the above script, I need to start a separate thread for every "pattern" value.

Output:
[root@edgenod]# python p123.py
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_ACM_pvt
List of files in the directory are:
dsplit.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_ACX_pvt
List of files in the directory are:
accountshighfocus.err
acm_access_log.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_AW_pvt
List of files in the directory are:
aware.err
-----------------------
The directory path is:
/ai2/data/dev/admin/inf/inf_BC_pvt
List of files in the directory are:
run_ingest_BC_daily_1249.err
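One way the request above could be sketched, assuming Python 3: start one `threading.Thread` per pattern and join them all. The names `search_pattern` and `search_all` are hypothetical. The sketch builds absolute paths instead of calling `os.chdir()`, because the working directory is shared by all threads in a process, and it lists only the `.err` files directly inside each matching directory:

```python
import glob
import os
import threading

# Path pattern taken from the post above.
BASE_PATH = '/ai2/data/dev/admin/inf/*{}*'


def search_pattern(pattern, results, lock, base_path=BASE_PATH):
    """Collect the .err files of every directory matching one pattern."""
    found = {}
    for directory in glob.glob(base_path.format(pattern)):
        # Absolute glob path: no os.chdir(), which would affect all threads
        found[directory] = sorted(glob.glob(os.path.join(directory, '*.err')))
    with lock:
        # Store the result under the lock so writes never overlap
        results[pattern] = found


def search_all(patterns, base_path=BASE_PATH):
    """Start one thread per pattern and wait for all of them to finish."""
    results, lock = {}, threading.Lock()
    threads = [threading.Thread(target=search_pattern,
                                args=(p, results, lock, base_path))
               for p in patterns]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == '__main__':
    for pattern, dirs in search_all(['ACM', 'ACX', 'AW', 'BC']).items():
        for directory, files in dirs.items():
            print("-----------------------")
            print("The directory path is:")
            print(directory)
            print("List of files in the directory are:")
            for name in files:
                print(os.path.basename(name))
```

Printing happens only after every thread has been joined, so the report of one pattern is not interrupted by another thread's output.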