Python Forum

Full Version: Multiprocessing my Loop/Iteration (Try...Except)
I'm converting one chemical notation to another. My list has over 6k different names to convert, and it takes a very long time. How can I use multiprocessing? I tried to implement it myself, but I'm a noob. Other code optimisations are welcome too!

def resolve(str_input, representation):
    import cirpy
    return cirpy.resolve(str_input, representation)

compound_list = []
smiles_list = []

for index, row in df_Verteilung.iterrows():

    try:
        actual_smiles = resolve(row['Compound'], 'smiles')

    except:
        actual_smiles = 'Error'

    print('\r', row['Compound'], actual_smiles, end='')

    compound_list.append(row['Compound'])
    smiles_list.append(actual_smiles)

df_new = pd.DataFrame({'Compound' : compound_list, 'SmilesCode' : smiles_list})
df_new.to_csv(index=False)
Hi,

There is no multiprocessing in your posted code at all... What have you tried so far? Also, 6000 names doesn't sound that excessive. How long does one conversion take?

Is there any reason why you don't use the apply method of Pandas for converting the Compound column? This would make your code much simpler.

Notes on your code:
* Never use a bare try... except, as this catches all errors, including programming errors. Errors should be caught explicitly.
* Having a postfix stating the data type of a variable doesn't make much sense. The data type should be clear from your code.
* The import statement in line 2 should be moved to the top of your code; it shouldn't be inside the function.

Regards, noisefloor
I know there is no multiprocessing in it. I didn't post it, because there was always an error.
A conversion took about 1 1/2 hours on my MBP.

I tried the apply function, but it didn't work. Could you show me how the apply method works?

try..except is the only thing I know. Is there any other option?

Thank you for the fast answer noisefloor!!
Hi,

Quote: I know there is no multiprocessing in it. I didn't post it, because there was always an error.
Let's put it this way: if there had been no error, there would have been no reason to post here ;-)
The point is: even wrong code is a better starting point than no code at all. So please post your code here.

Quote: I tried the apply function, but it didn't work. Could you show me how the apply method works?
What did you try? Show the code. "It did not work" is not helpful as an error message.

Quote: try..except is the only thing I know. Is there any other option?
except accepts an argument to catch specific exceptions, e.g. except IndexError would catch index errors only, not name errors or ...
These are Python basics, so I'd recommend reading the official documentation or the corresponding section in the tutorial on this again.
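
To make the point about explicit exceptions concrete, here is a minimal sketch. The function safe_lookup and the dictionary known are hypothetical stand-ins, not part of cirpy; which exceptions to name around the real cirpy.resolve call depends on what that call actually raises, which is not shown in this thread.

```python
def safe_lookup(table, name):
    """Return table[name], or 'Error' if the name is missing."""
    try:
        return table[name]
    except KeyError:
        # Only a missing key is handled here. A programming mistake such as
        # a misspelled variable name would still raise NameError and be seen.
        return "Error"


# Hypothetical stand-in data, not real resolver output.
known = {"water": "O", "ethanol": "CCO"}

print(safe_lookup(known, "water"))       # prints: O
print(safe_lookup(known, "unobtanium"))  # prints: Error
```

With a bare except, a typo inside the try block would silently turn into 'Error' as well, which makes bugs very hard to spot.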

Regards, noisefloor
That was my attempt at implementing multiprocessing. The error was: unexpected indent.

from multiprocessing import Pool

    def resolve(str_input, representation):
        try:
            import cirpy
            res =  cirpy.resolve(str_input, representation)
        except:
            res = "Error"

        print('\r', row['Compound'], res, end='')
        return res

    compound_list = [row['Compound'] for row in df_Verteilung.iterrows()]

    n = 5

    with Pool(processes=n) as pool:
        smiles_list = pool.starmap(resolve, [(row['Compound'], 'smiles') for row in df_Verteilung.iterrows()])

    df_new = pd.DataFrame({'Compound' : compound_list, 'SmilesCode' : smiles_list})
    df_new.to_csv(index=False)
This was my try for the apply-function:

import CIRpy
def resolve(x):
    return cirpy.resolve(str_input, "smiles")

    df2["Compound"] = df2["Compound"].apply(resolve)
Thank you again noisefloor!!
Hi,

The multiprocessing code looks OK at first glance. Which error do you get?

On the data frame: after consulting the Pandas docs, apply isn't the right method. You may try map instead, see: https://stackoverflow.com/questions/3496...gle-column
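
A minimal sketch of map on a single column, assuming a small made-up data frame. FAKE_SMILES and to_smiles are hypothetical stand-ins for the cirpy.resolve call so that the example runs without network access; in the real code the function body would call the resolver instead.

```python
import pandas as pd

# Hypothetical stand-in for cirpy.resolve(name, "smiles"); no network needed.
FAKE_SMILES = {"water": "O", "ethanol": "CCO"}


def to_smiles(name):
    """Convert one compound name; None stands for a name that was not found."""
    return FAKE_SMILES.get(name)


df = pd.DataFrame({"Compound": ["water", "ethanol", "unknown stuff"]})

# map calls to_smiles once per value and returns a new Series of the results.
df["SmilesCode"] = df["Compound"].map(to_smiles)

print(df)
```

Note that the function uses its own parameter (name) and is called without any indentation in front of the last statement; the explicit loop and the two helper lists are no longer needed.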

Regards, noisefloor
First I got: unexpected indent in line 3,
but then nothing happened: no error, but I can't see any results...

I'm going to try map now and post the code afterwards.

Noisefloor, thank you very much. You're carrying my ass!!!
Hi,

Well, I thought the code in the previous post had wrong indentation because of a C&P error. But if your code is REALLY like this, line 3 to the end are indented 4 spaces too much. This won't run.
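
For comparison, a minimal sketch of the layout with everything starting at column 0. FAKE_SMILES and the compounds list are hypothetical stand-ins so the example runs without cirpy, pandas, or network access; the real worker would call cirpy.resolve and catch whatever exceptions that call actually raises.

```python
from multiprocessing import Pool

# Hypothetical stand-in for cirpy.resolve; no network needed.
FAKE_SMILES = {"water": "O", "ethanol": "CCO"}


def resolve(name, representation):
    """Worker: defined at module level so the child processes can find it."""
    # representation is unused by the stand-in; the real resolver would need it.
    try:
        return FAKE_SMILES[name]
    except KeyError:
        return "Error"


def main():
    compounds = ["water", "ethanol", "unknown stuff"]
    with Pool(processes=2) as pool:
        # starmap unpacks each tuple into resolve(name, representation).
        smiles = pool.starmap(resolve, [(c, "smiles") for c in compounds])
    for compound, code in zip(compounds, smiles):
        print(compound, code)


if __name__ == "__main__":
    main()
```

Two details differ from the posted attempt besides the indentation. The worker only uses its own arguments, because a name like row does not exist inside the function. And the pool is started under the if __name__ == "__main__" guard, which is needed on platforms that start workers by spawning a fresh interpreter (Windows, and macOS by default on recent Python versions). Also keep in mind that iterrows() yields (index, row) pairs, so a plain "for row in df.iterrows()" gives you tuples, not rows.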

Regards, noisefloor