Python Forum

Full Version: Need logic on how to scrap 100K URLs
Hi,
Could someone please explain the logic I should follow for the requirement below?

My requirement is:

I have a website, say www.example.com. Once I log in, I have to search for a product. The website then returns things like:
  1. The demand for the product
  2. The supply for the product
  3. Medium sales for the product
  4. Maximum sales for the product

On another line, it gives me the 'total number of this product'.

On another line, it gives me the most important information, which is:
'Other related products'

The related products look like 'product name 123', 'product name 236', 'product name 483', etc. Clicking on any of these related products opens a similar page with the same type of information:
  1. The demand for the product
  2. The supply for the product
  3. Medium sales for the product
  4. Maximum sales for the product
On another line, it gives me the 'total number of this product'.

On another line, it gives me the most important information, which is:
'Other related products', and so on. The same process then has to be followed for each product.

What would be the logic for a Python script that reads one URL and gets all the information from that page, namely:
demand, supply, medium sales, maximum sales, and the total number of the product?

Then it should follow all the related products one by one and extract the same information from each. This opens a chain of URLs, because each product has some related products and each related product has its own related products.

In this way, one starting URL ends up opening around 100K URLs in the browser. To summarize: how should I approach the logic for extracting information from around 100K URLs? The information I want is:

  1. demand
  2. supply
  3. medium sales
  4. maximum sales
  5. total number of products on sales
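As a rough illustration of pulling the five fields listed above out of one page, here is a minimal sketch. It assumes the page text contains each value as "Label: number" (for example "Demand: 120"); the label wording, the `FIELD_LABELS` mapping and the `extract_fields` helper are all hypothetical, since the real site's markup is not shown in this thread.

```python
import re

# Hypothetical labels: the real page's wording and layout are unknown.
FIELD_LABELS = {
    "demand": "Demand",
    "supply": "Supply",
    "medium_sales": "Medium sales",
    "maximum_sales": "Maximum sales",
    "total_number": "Total number",
}


def extract_fields(page_text):
    """Return {field: int or None}; None means the label was not found."""
    result = {}
    for key, label in FIELD_LABELS.items():
        # Match e.g. "Medium sales: 1,250" and capture the digits and commas.
        pattern = re.escape(label) + r"\s*:\s*(\d[\d,]*)"
        match = re.search(pattern, page_text, re.IGNORECASE)
        if match:
            result[key] = int(match.group(1).replace(",", ""))
        else:
            result[key] = None
    return result
```

If the values actually sit in specific HTML tags, an HTML parser with selectors for those tags would probably be more robust than matching on plain text.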
you can start here, doesn't take long, and you'll learn a lot about the basics

web scraping part 1
web scraping part 2
(Jun-29-2020, 08:28 AM)Larz60+ Wrote: you can start here, doesn't take long, and you'll learn a lot about the basics

web scraping part 1
web scraping part 2

Thanks for the reply. I am going through the posts you shared, but my requirement is quite different.

The posts do not cover any similar logic. I guess I have to learn more advanced material first and then proceed.

What do you suggest?
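
One possible way to think about the "chain of related products" is as a graph traversal: keep a queue of URLs still to visit and a set of URLs already seen, so each product page is fetched only once even when products link back to each other. The sketch below is only an outline under that assumption. The `crawl` function and the `fetch_page` callback are hypothetical names; `fetch_page(url)` is assumed to return the scraped data for that page plus the list of related-product URLs found on it.

```python
from collections import deque


def crawl(start_url, fetch_page, max_pages=100_000):
    """Breadth-first crawl starting from start_url.

    fetch_page(url) is a caller-supplied function (hypothetical here) that
    must return a tuple: (data_for_this_page, list_of_related_urls).
    """
    queue = deque([start_url])   # URLs waiting to be visited
    seen = {start_url}           # URLs already queued, to avoid repeats
    results = {}                 # url -> scraped data

    while queue and len(results) < max_pages:
        url = queue.popleft()
        data, related_urls = fetch_page(url)
        results[url] = data
        for link in related_urls:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return results
```

Pages are processed one at a time rather than all opened at once in a browser. In practice `fetch_page` would probably wrap a logged-in HTTP session (for example a `requests.Session`) plus the parsing covered in the tutorials, and for 100K pages it would be sensible to add a delay between requests and to save `results` periodically so an interrupted run can be resumed.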