Python Forum
Parallel processing and distributed computing with Python
#1
What are the most popular approaches for parallel processing and distributed computing using Python?

1. Is it worth using Python for such tasks?
2. What are the most general and efficient frameworks/libraries/packages for them?

I'd very much like to hear a comprehensive answer that describes the current state of things in this area.
#2
See also https://wiki.python.org/moin/ParallelProcessing
#3
It depends on what you are doing: data processing, heavy math calculations, or something else. You could use all the cores of one PC, use asynchronous execution (doing something else while the core waits for data from disk, RAM, or the network), or even spread the calculations across many PCs. If you are more specific, we can point you to some approaches and libraries. There are lots of modules you can use for concurrent programming.
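To make the asynchronous option concrete, here is a minimal sketch using the standard-library asyncio module. The `fetch` coroutine and its delays are made up for illustration; `asyncio.sleep` only stands in for a real wait on disk or network.

```python
import asyncio
import time


async def fetch(name, delay):
    # Hypothetical task: asyncio.sleep stands in for a slow disk or network read.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather() runs the three waits concurrently on a single thread.
    results = await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to the longest single delay rather than the sum. This helps with I/O-bound work only; CPU-bound work still needs multiple processes or machines.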
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#4
Thanks for the answer!
Well, I mostly mean heavy math calculations and data processing, especially machine learning / deep learning computations and algorithms.
#5
You could start with the built-in concurrent.futures.ProcessPoolExecutor, which is a wrapper around the multiprocessing module. If heavy computations are involved, I would recommend the gmpy2 library; I've used it for a while. Perhaps Scikit-learn for the machine learning part. I say perhaps because I've never done machine learning before. I think installing it will pull in NumPy and SciPy as well. And I think this is enough to start.
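A minimal sketch of the ProcessPoolExecutor approach, assuming a CPU-bound task that can be split into independent pieces. The `count_primes` function and the limits are invented for illustration; any picklable top-level function works the same way.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, workers=None):
    # workers=None lets the executor pick a default based on the CPU count.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # map() sends each limit to a worker process and keeps input order.
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The __main__ guard matters: worker processes may re-import this module.
    limits = [20_000, 30_000, 40_000, 50_000]
    print(count_primes_parallel(limits))
```

Each call runs in its own process, so the work is not serialized by the GIL. The arguments and results are pickled between processes, so this pays off when each task does a lot of computation on relatively little data.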

A quick Google search for deep learning suggests that Keras is at the top of the list.
#6
(Nov-25-2017, 09:57 PM)GarlicScience Wrote: Well, I mean more heavy math calculations and data processing, especially machine learning / deep learning computations and algorithms.
There is a lot going on in that field, as Python is probably the most widely used language there now.

Dask
Scales up: Runs resiliently on clusters with 1000s of cores

Dask.distributed
It extends both the concurrent.futures and dask APIs to moderate sized clusters.

TensorFlow
The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop,
server, or mobile device with a single API.

Both of these libraries can also take advantage of GPUs.
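Since Dask.distributed is described above as extending the concurrent.futures API, it may help to see that API's submit-and-collect pattern on its own. This is a sketch using only the standard library, with a local thread pool as a stand-in for a cluster; `square` and `run_tasks` are hypothetical names, and nothing here is Dask code.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(x):
    # Hypothetical unit of work; on a cluster this would be the remote task.
    return x * x


def run_tasks(values, workers=4):
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # submit() returns a Future immediately; remember which input it belongs to.
        futures = {pool.submit(square, v): v for v in values}
        # as_completed() yields futures as they finish, not in submission order.
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return results


if __name__ == "__main__":
    print(run_tasks([1, 2, 3, 4]))
```

A cluster client that follows this interface is used in much the same way: submit a function with its arguments, get a future back, and collect the result later. Check the Dask.distributed documentation for the exact calls before relying on this analogy.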