Python Forum
How to support multiple users with heavy data processing
#1
Hi All,

Forgive my ignorance of the realities of networking, but how can I support many users simultaneously with my (cloud-hosted) scraping app? I have written a demo that supports only a single user:

After user input (via browser), the script runs several cycles of scraping and crunching, which usually takes 5-10 minutes. (Users get the output by email.) This is fine for a single user, but how could I support 100 or even 1000 users simultaneously? A separate script running for every single user?! :-O

Queue? - not really feasible, because users won't wait very long for their report
Multi-thread? - this will slow things down (each thread will take longer than it would running alone - right?)

...so I am left wondering whether I have to have one script running for every single user?!

Somewhere else I saw a reference to Twisted, but I am not sure how this fits.

There must be other web apps or cloud-based services that have to deliver serious real-time data crunching to users. How do they do that?

Thanks a lot for any help in advance - I really appreciate any advice.
#2
If your processing takes 15 minutes, the user has to wait 15 minutes.
The program structure could be the following:
  • A manager process, which receives the tasks and sends each one to a free worker process
  • Worker processes, which are started by the manager
  • An asynchronous web interface, which handles the communication between manager and client
  • An e-mail client, which sends the finished work to the user; the result can also stay accessible on the web server for a limited time

These are my thoughts. Maybe you can make it simpler, but I guess you'll end up with a big message queue passing messages back and forth.
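
The manager/worker part of the structure above could be sketched roughly like this with the standard library's `concurrent.futures`. This is only a minimal sketch under assumptions: `scrape_and_crunch`, `send_report`, `run_manager` and the `POOLS` table are hypothetical names, the job body is a stand-in for the real scraping, and the e-mail step just records the report in a list instead of sending anything.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed

# Both pool types share one interface. Processes suit CPU-heavy crunching;
# threads can be enough when a job mostly waits on the network.
POOLS = {"process": ProcessPoolExecutor, "thread": ThreadPoolExecutor}


def scrape_and_crunch(job):
    """Hypothetical stand-in for the real multi-minute scrape-and-crunch job."""
    user_email, url = job
    return user_email, f"report for {url}"


def send_report(user_email, report, outbox):
    """Placeholder for the e-mail step: records the report instead of sending it."""
    outbox.append((user_email, report))


def run_manager(jobs, kind="process", max_workers=4):
    """Hand each job to a free worker and 'mail' each report as it finishes."""
    outbox = []
    with POOLS[kind](max_workers=max_workers) as pool:
        # submit() queues the job; the pool runs it as soon as a worker is free.
        futures = [pool.submit(scrape_and_crunch, job) for job in jobs]
        # as_completed() yields futures in finishing order, not submission order.
        for future in as_completed(futures):
            user_email, report = future.result()
            send_report(user_email, report, outbox)
    return outbox
```

With `kind="process"` the call should sit under an `if __name__ == "__main__":` guard when run as a script, since worker processes may re-import the main module. In a real service the web interface would push jobs onto a queue rather than pass a fixed list, but the hand-off to free workers works the same way.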
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
Thanks for that. I appreciate it.

In order to support 1000 simultaneous 15-min requests, I am going to need 1000 workers - right?

Joe
#4
Or faster machines. Or you could optimize the function that does the work.
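
To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch. It assumes every request arrives at the same moment, each worker handles one job at a time, and all jobs take equally long; the function names are made up for illustration.

```python
import math


def worst_case_wait_minutes(n_requests, n_workers, job_minutes):
    """Time until the last report is done when jobs run in 'waves' of n_workers."""
    waves = math.ceil(n_requests / n_workers)
    return waves * job_minutes


def workers_needed(n_requests, job_minutes, max_wait_minutes):
    """Smallest worker count that finishes every job within max_wait_minutes."""
    waves_allowed = math.floor(max_wait_minutes / job_minutes)
    if waves_allowed < 1:
        raise ValueError("a single job already takes longer than the allowed wait")
    return math.ceil(n_requests / waves_allowed)


print(worst_case_wait_minutes(1000, 1000, 15))  # 15
print(worst_case_wait_minutes(1000, 250, 15))   # 60
print(workers_needed(1000, 15, 60))             # 250
```

So 1000 workers are only needed if nobody may wait longer than one job's duration. Halving the job time, by faster machines or a faster function, halves either the wait or the number of workers.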

