Python Forum
Extracting Headers from Many Pages Quickly
#3
So, a bit of necromancy, but it turns out that requests.Session() can maintain multiple connection pools, one for each domain you connect to, recycling pools if the number of domains exceeds the limit. In my testing, threads failed to return results when I tried to open more connections than the pool's maximum size. What worked was setting the number of pools equal to the number of domains, spawning one thread per domain, and having each domain thread spawn pool_maxsize child threads. The results came back much faster than anything I've ever seen go through this network, so I was quite pleased. Unfortunately, I don't think I'll be allowed to share code as this was for work, but I hope this helps anyone who faces a similar issue in the future.
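For anyone wanting a starting point, here is a minimal sketch of the approach described above. It is not the original code from work. The function names (group_by_host, build_session, fetch_all) and the default pool size are hypothetical, and the sketch assumes the pool settings are applied through requests.adapters.HTTPAdapter with its pool_connections, pool_maxsize and pool_block arguments.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def group_by_host(urls):
    """Group URLs by host so each host can get its own worker thread."""
    groups = defaultdict(list)
    for url in urls:
        groups[urlsplit(url).netloc].append(url)
    return dict(groups)


def build_session(num_hosts, pool_maxsize=10):
    """Create a Session that keeps one connection pool per host (assumed setup)."""
    # Imported lazily so the threading helpers work without requests installed.
    import requests
    from requests.adapters import HTTPAdapter

    session = requests.Session()
    adapter = HTTPAdapter(
        pool_connections=max(num_hosts, 1),  # how many per-host pools to cache
        pool_maxsize=pool_maxsize,           # connections kept in each pool
        pool_block=True,                     # wait for a free connection instead of opening extras
    )
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch_all(urls, fetch, pool_maxsize=10):
    """Call fetch(url) for every URL and return {url: result_or_exception}.

    One thread is started per host, and each host thread runs at most
    pool_maxsize child threads.
    """
    groups = group_by_host(urls)
    if not groups:
        return {}

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:  # record the failure instead of losing the batch
            return exc

    def fetch_host(host_urls):
        # Cap child threads at pool_maxsize so a host never needs more
        # simultaneous connections than its pool holds.
        with ThreadPoolExecutor(max_workers=pool_maxsize) as children:
            return list(zip(host_urls, children.map(safe_fetch, host_urls)))

    results = {}
    with ThreadPoolExecutor(max_workers=len(groups)) as hosts:
        for pairs in hosts.map(fetch_host, groups.values()):
            results.update(pairs)
    return results
```

With requests installed, something like session = build_session(len(group_by_host(urls))) followed by fetch_all(urls, lambda url: dict(session.head(url, timeout=10).headers)) should collect the headers for each page. Passing the fetch function in as an argument is just a design choice that makes the threading part easy to try without touching the network.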


Messages In This Thread
RE: Extracting Headers from Many Pages Quickly - by OstermanA - Oct-01-2019, 08:01 AM
