Python Forum
Errors using --processes parameter
#1
Hello all,

I'm new to Python and not even a "real" programmer, so I apologize if any of my questions show my lack of expertise ;-)

I'm trying to create "bags" (a defined structure) using a Python script that's been provided. It's called "bagit.py" and can be found on GitHub (https://github.com/edsu/bagit/blob/master/bagit.py). The script works very well, but given the amount of data to be packed, it runs for many hours. In the README (readme.rst) I found:

Quote: Since calculating checksums can take a while when creating a bag, you may want to calculate them in parallel if you are on a multicore machine. You can do that with the --processes option:

bagit.py --processes 4 /directory/to/bag

Unfortunately, following this approach leads to multiple errors (I don't know where to attach the screenshot, but I have one) and NO result at all.

As a possible solution I changed every occurrence of "processes=1" to "processes=4" in the script, but that didn't help; it just produced different error messages.

Could one of you guide me to the correct use (or syntax), please?

Thank you ever so much!
Michael
Reply
#2
First of all, show the exact command you are using. Note that --processes is a CLI option; there is no equal sign like the one you show.
Second, copy and paste the error inside error tags (see the BBCode help for more info).
Reply
#3
sonhospa Wrote: Unfortunately following this approach leads to multiple errors (I don't know where to attach the screenshot, but I have one) and NO result at all.
These error messages are the starting point for solving the problem. Find a way to add them to a post. You could, for example, redirect the command's output to a file and then copy and paste that file's content. Use error tags to post.
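If shell redirection is awkward, a small Python wrapper can collect the output instead. This is only a sketch: `run_and_log` and the log file name used below are made up for illustration and are not part of bagit.py.

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run `command` and write its combined stdout and stderr to `log_path`.

    Returns the exit code of the command.
    """
    # Binary mode: the child process writes raw bytes straight into the file.
    with open(log_path, "wb") as log_file:
        completed = subprocess.run(
            command,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # send tracebacks to the same file
        )
    return completed.returncode
```

For the case in this thread, a call could look like `run_and_log([sys.executable, "bagit.py", "--processes", "4", r"path\to\bag"], "bagit_errors.log")`, assuming bagit.py sits in the current directory. The resulting file can then be opened in an editor and the relevant part pasted into a post.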
Reply
#4
Hello Buran and Gribouillis,

thank you for your help. I hope adding a "New Reply" is the right way here, as I couldn't see an individual reply button.

Buran, I used the command line exactly as given in the quote above (just with my own path instead of the sample path). Only after this attempt failed did I try an alternative experiment: I edited the Python script (bagit.py), where the equal sign is part of several Python statements, and that's where I changed "processes=1" to "processes=4". You can see those places in the GitHub link I provided.

Now the errors:
a) First attempt before changing the script: "bagit.py --processes 4 path\to\bag"
Error:
2020-07-01 14:34:35,122 - INFO - Creating bag for directory E:\bagit-master\test-data\loc
2020-07-01 14:34:35,124 - INFO - Creating data directory
2020-07-01 14:34:35,124 - INFO - Moving data to E:\bagit-master\test-data\loc\tmpkstm4gie\data
2020-07-01 14:34:35,125 - INFO - Moving E:\bagit-master\test-data\loc\tmpkstm4gie to data
2020-07-01 14:34:35,126 - INFO - Using 4 processes to generate manifests: sha256, sha512
Traceback (most recent call last):
  File "C:\Users\Michael\AppData\Local\Programs\Python\Python37\lib\site-packages\pkg_resources\_vendor\packaging\requirements.py", line 90, in __init__
    req = REQUIREMENT.parseString(requirement_string)
  File "C:\Users\Michael\AppData\Local\Programs\Python\Python37\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1654, in parseString
    raise exc
  File "C:\Users\Michael\AppData\Local\Programs\Python\Python37\lib\site-packages\pkg_resources\_vendor\pyparsing.py", line 1644, in parseString
    loc, tokens = self._parse( instring, 0 )
This is only a small part, because the error messages run endlessly (i.e. hours!). I interrupted the process after a few seconds and have a file of 7000 lines already.

b) Second attempt: I changed the script (at 4 positions) from "processes=1" to "processes=4" and renamed it. Command: "bagitprocesses4.py path\to\bag"
Error:
I couldn't reproduce the error now!
BUT: After all the files were moved correctly to a 'data' directory and the first text files were written correctly, only one process is used. The relevant part of the output is (the first 6,000 lines and all the following lines are left out; see the "Using 1 processes" line):
Quote:
2020-07-01 15:08:16,775 - INFO - Moving reel_08.006042.jpg to E:\Alpha-Omega\Alpha-Omega Bilder für Tests\KAMERADSCHAFT_08_dF_Fertig_jpg\tmprkg_lbc4\reel_08.006042.jpg
2020-07-01 15:08:16,776 - INFO - Moving tagmanifest-sha256.txt to E:\Alpha-Omega\Alpha-Omega Bilder für Tests\KAMERADSCHAFT_08_dF_Fertig_jpg\tmprkg_lbc4\tagmanifest-sha256.txt
2020-07-01 15:08:16,777 - INFO - Moving tagmanifest-sha512.txt to E:\Alpha-Omega\Alpha-Omega Bilder für Tests\KAMERADSCHAFT_08_dF_Fertig_jpg\tmprkg_lbc4\tagmanifest-sha512.txt
2020-07-01 15:08:16,778 - INFO - Moving E:\Alpha-Omega\Alpha-Omega Bilder für Tests\KAMERADSCHAFT_08_dF_Fertig_jpg\tmprkg_lbc4 to data
2020-07-01 15:08:16,864 - INFO - Using 1 processes to generate manifests: sha256, sha512
2020-07-01 15:08:16,888 - INFO - Generating manifest lines for file data/reel_08.000043.jpg
2020-07-01 15:08:16,935 - INFO - Generating manifest lines for file data/reel_08.000044.jpg
2020-07-01 15:08:16,965 - INFO - Generating manifest lines for file data/reel_08.000045.jpg
2020-07-01 15:08:16,983 - INFO - Generating manifest lines for file data/reel_08.000046.jpg

So it seems that editing the script didn't produce errors like the CLI option did, but it also didn't change the number of processes and was therefore pointless.

Does that information help?
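
A possible explanation for the second attempt, offered as a guess rather than a reading of bagit.py itself: if a script parses `--processes` with its own default of 1 and then passes the parsed value on explicitly, the defaults in the function signatures are never consulted. The toy script below shows that effect; `make_manifests` and `main` are hypothetical names, and this is not bagit.py's code.

```python
import argparse


def make_manifests(paths, processes=4):
    """Hypothetical stand-in for a worker function.

    The default was edited from 1 to 4; the function only reports
    how many processes it was asked to use.
    """
    return processes


def main(argv):
    parser = argparse.ArgumentParser()
    # The command-line option carries its own default, independent of
    # the default in the function signature above.
    parser.add_argument("--processes", type=int, default=1)
    parser.add_argument("directory")
    args = parser.parse_args(argv)
    # The parsed value is passed explicitly, so the edited default of 4
    # in make_manifests() is never used.
    return make_manifests([args.directory], processes=args.processes)


print(main(["some/dir"]))                      # prints 1
print(main(["--processes", "4", "some/dir"]))  # prints 4
```

If the real script behaves like this sketch, editing "processes=1" inside the function definitions changes nothing, and only the value given on the command line (or the option's own default) decides how many processes are used.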
Reply

