Python Forum
Read Data with multiprocessing
#1
Hey guys,

I have been trying to get into 'multiprocessing' for a few days. I have a few hundred files, each containing a few thousand numbers, which I want to store in a matrix. I found some code on the internet that seems applicable to my problem, but I have not fully understood it yet (I am relatively new to Python). Here is my modified version of the code (explanation further down):

import numpy as np
import multiprocessing
import math

def writedata(rs, out_q):
    data_sim = np.zeros((4281, 80))
    for kkk in rs:
        rs_str = str(kkk)
        fnrs = '../OUTPUT_FILES' + '/Datei' + rs_str
        data_sim_rec = np.genfromtxt(fnrs)
        data_sim_rec = np.array([data_sim_rec[:, 1]]).T
        data_sim[:, kkk] = data_sim_rec[:, 0]
    out_q.put(data_sim)

rs = np.arange(1, 80, 1)
nprocs = 10
out_q = multiprocessing.Queue()
chunksize = int(math.ceil(len(rs) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(target=writedata, args=(rs[chunksize*i:chunksize*(i+1)], out_q))
    procs.append(p)
    p.start()
resultdict = {}

for i in range(nprocs):
    resultdict.update((out_q.get()))

for p in procs:
    p.join()

data_sim = resultdict
I define a function called writedata which goes into a folder containing, for example, 80 files, reads each file's data into data_sim_rec, and stores it in the matching column of the matrix data_sim. The matrix is then put on a queue. I distribute the work across the processes by giving each one a slice of the list of file numbers to loop over. Finally, the data should end up in resultdict. The error message is:
 dictionary update sequence element #0 has length 80, 2 is required.
I know that I am probably mixing data types and that the 'update' line is wrong as well, but I have already tried many things that did not work.

I would be very happy about an answer!

Thanks in advance and best regards,

Max

Sorry, here's the whole traceback:

Traceback (most recent call last):
  File "FORUM.py", line 31, in <module>
    resultdict.update((out_q.get()))
ValueError: dictionary update sequence element #0 has length 80; 2 is required
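
A note on where this error most likely comes from: dict.update() accepts either a mapping or an iterable of (key, value) pairs. When it is handed a 2-D array, it iterates over the rows and tries to unpack each 80-element row as a pair, which fails. The sketch below reproduces the same message without multiprocessing or NumPy. The nested list is only a stand-in for the array that comes off the queue, and the keys 5 and 6 are made up for illustration.

```python
# A nested list stands in for the (4281, 80) array; only 3 rows are used here.
matrix = [[0.0] * 80 for _ in range(3)]

result = {}
try:
    # update() walks over the rows and expects each one to be a 2-item pair.
    result.update(matrix)
except ValueError as exc:
    message = str(exc)
    print(message)

# Two forms that update() does accept: a mapping, or an iterable of pairs.
result.update({5: matrix[0]})
result.update([(6, matrix[1])])
print(sorted(result))
```

So if a dictionary is the desired result, each worker would need to put something pair-like on the queue, for example a dict mapping the file number to its column, rather than the whole matrix.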
#2
Use a Manager list or dictionary to communicate between/to/from processes. A list would possibly work in this case with the function appending the url to a Manager list. The program would then check the list every second or so and print it, removing the items as they are printed if you want, or just reprinting all of the urls otherwise. See "Sharing State Between Processes" at https://www.cs.colorado.edu/~kena/classe...entati.pdf for an example. Post back if you have problems.
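
Applied to the file-reading question in post #1, the Manager dictionary idea could look roughly like the sketch below. It is only an illustration under several assumptions: the helper names (read_second_column, worker, split_chunks) are made up, the demo writes six tiny files into a temporary folder instead of using the real ../OUTPUT_FILES data, and a plain-Python reader is swapped in for np.genfromtxt so that the example runs without NumPy.

```python
import multiprocessing
import os
import tempfile


def read_second_column(path):
    """Return the second whitespace-separated column of a text file as floats."""
    with open(path) as handle:
        return [float(line.split()[1]) for line in handle if line.strip()]


def worker(indices, folder, shared):
    """Read every file in this chunk and store its column under the file number."""
    for k in indices:
        shared[k] = read_second_column(os.path.join(folder, "Datei%d" % k))


def split_chunks(items, nprocs):
    """Split items into at most nprocs contiguous chunks of near-equal size."""
    size = max(1, -(-len(items) // nprocs))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    # Hypothetical demo data: 6 small two-column files instead of the real ones.
    folder = tempfile.mkdtemp()
    for k in range(1, 7):
        with open(os.path.join(folder, "Datei%d" % k), "w") as handle:
            for row in range(4):
                handle.write("%d %d\n" % (row, 10 * k + row))

    with multiprocessing.Manager() as manager:
        shared = manager.dict()  # dictionary proxy that all processes can write to
        procs = [
            multiprocessing.Process(target=worker, args=(chunk, folder, shared))
            for chunk in split_chunks(list(range(1, 7)), 3)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        resultdict = dict(shared)  # copy out before the manager shuts down

    print(sorted(resultdict))  # file numbers that were read
    print(resultdict[3])       # column read from Datei3
```

Because every column is stored under its file number, the order in which the processes finish does not matter. If a matrix is needed afterwards, the columns can be stacked in key order, for example with np.column_stack.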