Python Forum
Multiprocessing my Loop/Iteration (Try...Except)
#1
I'm converting chemical names from one notation to another. My list has over 6,000 different names to convert, and it takes very long. How can I use multiprocessing? I tried to implement it myself, but I'm a noob. Other code optimisations are welcome too!

def resolve(str_input, representation):
    import cirpy
    return cirpy.resolve(str_input, representation)

compound_list = []
smiles_list = []

for index, row in df_Verteilung.iterrows():

    try:
        actual_smiles = resolve(row['Compound'], 'smiles')

    except:
        actual_smiles = 'Error'

    print('\r', row['Compound'], actual_smiles, end='')

    compound_list.append(row['Compound'])
    smiles_list.append(actual_smiles)

df_new = pd.DataFrame({'Compound' : compound_list, 'SmilesCode' : smiles_list})
df_new.to_csv(index=False)
Reply
#2
Hi,

There is no multiprocessing in your posted code at all. What have you tried so far? Also, 6000 names doesn't sound that excessive. How long does one conversion take?

Is there any reason why you don't use the apply method of Pandas for converting the Compound column? This would make your code much simpler.

Notes on your code:
* Never use a bare try...except, as this catches all errors, including programming errors. Errors should be caught explicitly.
* A suffix stating the data type of a variable doesn't make much sense. The data type should be clear from your code.
* The import statement in line 2 should be moved to the top of your code; it shouldn't be inside the function.

Regards, noisefloor
Reply
#3
I know there is no multiprocessing in it. I didn't post it, because there was always an error.
A conversion took about 1 1/2 hours on my MBP.

I tried the apply function, but it didn't work. Could you show me how the apply method works?

try..except is the only thing I know. Is there any other option?

Thank you for the fast answer noisefloor!!
Reply
#4
Hi,

Quote: I know there is no multiprocessing in it. I didn't post it, because there was always an error.
Let's put it like this: if there had been no error, there would have been no reason to post here ;-)
The point is: even wrong code is a better starting point than no code at all. So please post your code here.

Quote: I tried the apply function, but it didn't work. Could you show me how the apply method works?
What did you try? Show the code. "It did not work" is not helpful as an error message.

Quote: try..except is the only thing I know. Is there any other option?
except accepts an argument to catch specific exceptions, e.g. except IndexError catches index errors only, not syntax errors, name errors, and so on.
These are Python basics, so I'd recommend rereading the official documentation or the corresponding section of the tutorial on this.
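
To illustrate catching one specific exception, here is a minimal sketch. It does not call cirpy: `lookup_smiles` and the `KNOWN_SMILES` table are hypothetical stand-ins that raise KeyError for unknown names. Which exception the real lookup raises is an assumption you would need to check yourself.

```python
# Hypothetical stand-in data, not taken from cirpy.
KNOWN_SMILES = {"water": "O", "ethanol": "CCO"}


def lookup_smiles(name):
    """Hypothetical resolver: raises KeyError for unknown names."""
    return KNOWN_SMILES[name]


def safe_lookup(name):
    """Return the SMILES string, or 'Error' if the name is unknown."""
    try:
        return lookup_smiles(name)
    except KeyError:
        # Only the expected failure is handled here. A programming
        # mistake (e.g. a TypeError from a wrong argument) still surfaces.
        return "Error"


results = [safe_lookup(name) for name in ["water", "unobtainium", "ethanol"]]
print(results)
```

With a bare except, passing a wrong argument type would silently produce 'Error' as well, which hides the bug instead of reporting it.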

Regards, noisefloor
Reply
#5
This was my attempt at implementing multiprocessing. The error was: unexpected indent.

from multiprocessing import Pool

    def resolve(str_input, representation):
        try:
            import cirpy
            res =  cirpy.resolve(str_input, representation)
        except:
            res = "Error"

        print('\r', row['Compound'], res, end='')
        return res

    compound_list = [row['Compound'] for row in df_Verteilung.iterrows()]

    n = 5

    with Pool(processes=n) as pool:
        smiles_list = pool.starmap(resolve, [(row['Compound'], 'smiles') for row in df_Verteilung.iterrows()])

    df_new = pd.DataFrame({'Compound' : compound_list, 'SmilesCode' : smiles_list})
    df_new.to_csv(index=False)
This was my attempt with the apply function:

import CIRpy
def resolve(x):
    return cirpy.resolve(str_input, "smiles")

    df2["Compound"] = df2["Compound"].apply(resolve)
Thank you again noisefloor!!
Reply
#6
Hi,

The multiprocessing code looks OK at first glance. Which error do you get?

On the data frame: after consulting the Pandas docs, apply isn't the right method. You may try map instead, see: https://stackoverflow.com/questions/3496...gle-column
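
A minimal sketch of the map approach on a single column, assuming a small made-up DataFrame. `to_smiles` and `KNOWN_SMILES` are hypothetical stand-ins for the real lookup, so the example runs without network access.

```python
import pandas as pd

# Hypothetical stand-in data, not taken from cirpy.
KNOWN_SMILES = {"water": "O", "ethanol": "CCO"}


def to_smiles(name):
    """Return the SMILES string for a name, or 'Error' if unknown."""
    return KNOWN_SMILES.get(name, "Error")


df = pd.DataFrame({"Compound": ["water", "ethanol", "unobtainium"]})

# Series.map calls to_smiles once per value in the column and
# returns a new Series, stored here as a second column.
df["SmilesCode"] = df["Compound"].map(to_smiles)
print(df)
```

With the real data, the stand-in would be replaced by a function that takes one name and calls cirpy.resolve(name, 'smiles').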

Regards, noisefloor
Reply
#7
First I got: unexpected indent in line 3.
After that nothing happened: no error, but I can't see any results either...

I'm going to try map now and post the code afterwards.

Noisefloor, thank you very much. You're carrying my ass!!!
Reply
#8
Hi,

Well, I thought the code in the previous post had wrong indentation because of a copy-and-paste error. But if your code REALLY looks like this, line 3 to the end are indented 4 spaces too much. This won't run.
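
For comparison, a sketch of how the Pool code could look with everything starting at column 0. It is not the original script: `KNOWN_SMILES` and the body of `resolve` are hypothetical stand-ins so it runs without cirpy or a data frame. It also passes plain names to the worker rather than using `row`, which is not defined inside the function in the attempt above.

```python
from multiprocessing import Pool

# Hypothetical stand-in data, not taken from cirpy.
KNOWN_SMILES = {"water": "O", "ethanol": "CCO"}


def resolve(name, representation):
    """Worker defined at module level (no leading indentation)."""
    if representation != "smiles":
        return "Error"
    return KNOWN_SMILES.get(name, "Error")


def convert_all(names, processes=2):
    """Resolve all names in parallel; results keep the input order."""
    with Pool(processes=processes) as pool:
        return pool.starmap(resolve, [(name, "smiles") for name in names])


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this part when
    # they import the main module (the default on Windows and macOS).
    compounds = ["water", "ethanol", "unobtainium"]
    print(convert_all(compounds))
```

Note that iterrows() yields (index, row) pairs, so with a real data frame the names are easier to get from the column itself, e.g. as a list of df_Verteilung['Compound'].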

Regards, noisefloor
Reply

