Python Forum
How to support multiple users with heavy data processing




How to support multiple users with heavy data processing - Gingmeister - Jun-19-2018

Hi All,

Forgive my ignorance of the realities of networking, but how can I support many users simultaneously with my (cloud-hosted) scraping app? I have written a demo that supports only a single user:

After user input (via browser), the script runs several cycles of scraping and crunching, which usually takes 5-10 minutes. (Users will get the output by email.) This is fine for a single user, but how could I support 100 or even 1000 users simultaneously? A separate script running for every single user?! :-O

Queue? - not really feasible, because users won't wait very long for their report
Multi-thread? - this will slow things down (each thread will take longer than it would as a single thread - right?)

...so I am left wondering whether I have to have one script running for every single user?!

Somewhere else I saw a reference to Twisted, but I am not sure how this fits.

There must be other web apps or cloud-based services that have to deliver serious real-time data crunching to users. How do they do that?

Thanks a lot for any help in advance - I really appreciate any advice.


RE: How to support multiple users with heavy data processing - DeaD_EyE - Jun-19-2018

If your processing takes 15 minutes, the user has to wait 15 minutes.
The program structure could be as follows:
  • Manager process, which receives the tasks and sends each one to a free worker process
  • Worker processes, which are started by the Manager
  • Asynchronous web interface, which handles the communication between Manager <> Client
  • E-mail client, which sends the finished work to the user; the result can also be accessed for a limited time on the web server

These are my thoughts. Maybe you can make it simpler, but I guess you'll end up with a big message queue going back and forth.
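
As a rough illustration of the manager/worker idea above, here is a minimal sketch using the standard library's concurrent.futures. The names crunch, send_report and run_manager are made up for this example, and the "job" is just a list of numbers standing in for the real scrape-and-crunch step. No web interface or real e-mail is included, so treat it as a starting point rather than a finished design.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def crunch(job):
    """Hypothetical stand-in for the scrape-and-crunch cycle of one request."""
    user, numbers = job
    return user, sum(n * n for n in numbers)


def send_report(user, result):
    """Hypothetical stand-in for the e-mail step; it only formats a string."""
    return f"report for {user}: {result}"


def run_manager(jobs, executor):
    """Hand every job to a free worker and collect the reports as they finish."""
    futures = [executor.submit(crunch, job) for job in jobs]
    # as_completed yields each future when its worker is done, in any order,
    # so the reports are sorted to give a stable result.
    return sorted(send_report(*f.result()) for f in as_completed(futures))


if __name__ == "__main__":
    jobs = [("alice@example.com", [1, 2, 3]), ("bob@example.com", [4, 5])]
    # max_workers caps how many requests are processed at once;
    # any further requests wait in the executor's internal queue.
    with ProcessPoolExecutor(max_workers=2) as pool:
        for line in run_manager(jobs, pool):
            print(line)
```

The executor plays the role of the Manager: it starts the worker processes and keeps its own queue of pending jobs. If most of the time is spent waiting on the network rather than on the CPU, passing a ThreadPoolExecutor to run_manager instead may be enough.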


RE: How to support multiple users with heavy data processing - Gingmeister - Jun-19-2018

Thanks for that. I appreciate it.

In order to support 1000 simultaneous 15-min requests, I am going to need 1000 workers - right?

Joe


RE: How to support multiple users with heavy data processing - DeaD_EyE - Jun-19-2018

Or faster machines. Or you optimize the function that does the work.
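
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes all requests arrive at the same moment, each worker handles one job at a time, and every job takes the same time. Those are simplifications, and the function name is made up for this example.

```python
import math


def minutes_until_last_report(requests, workers, minutes_per_job):
    """Time until the last user gets a report, if jobs run in full batches."""
    batches = math.ceil(requests / workers)
    return batches * minutes_per_job


# 1000 requests at 15 minutes each, with different numbers of workers.
for workers in (1000, 250, 100):
    print(workers, "workers:", minutes_until_last_report(1000, workers, 15), "min")

# Halving the job time halves the workers needed for the same 15-minute wait.
print(500, "workers:", minutes_until_last_report(1000, 500, 7.5), "min")
```

Under these assumptions, 1000 workers are only needed if nobody may wait longer than a single job takes. Fewer workers mean a longer wait for the last users in the queue, and a faster job reduces the number of workers required.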