Python Forum
Scraping all website text using Python
#1
I am very new to Python (so sorry in advance for any stupid questions). I have an Excel sheet with a unique company identifier and the corresponding URL next to it for a number of companies.

What I would like to do is open each URL and save all the website text (the complete text from the first page of the website) for each company to a separate .txt file. The name of each file should be the unique identifier from the Excel sheet.

Has anyone done something similar in the past, or could you help me with the code for this task?

That would be great!!
#2
I suggest that you go through snippsat's web scraping tutorials here:
web scraping part 1
web scraping part 2
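
For the task in post #1 (one .txt file per company, named after its identifier), the steps could be sketched roughly as below. This is only a minimal sketch and not code taken from the tutorials: the column names `company_id` and `url`, the helper function names, and the `User-Agent` value are all assumptions made for illustration. Fetching and text extraction use only the standard library; reading the sheet assumes pandas (plus openpyxl for .xlsx files) is installed.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import Request, urlopen


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of script/style tags."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the text of an HTML document, one text node per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.chunks)


def fetch_html(url, timeout=10):
    """Download a page and decode it to a string."""
    # Some sites reject requests without a browser-like User-Agent (assumed value).
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def save_text(identifier, text, out_dir):
    """Write text to <out_dir>/<identifier>.txt and return the file path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{identifier}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def scrape_companies(excel_path, out_dir, id_column="company_id", url_column="url"):
    """Save the page text of every URL in the sheet to its own .txt file.

    The column names are hypothetical; adjust them to match the real sheet.
    """
    import pandas as pd  # third-party; imported here so the helpers work without it

    sheet = pd.read_excel(excel_path)
    for identifier, url in zip(sheet[id_column], sheet[url_column]):
        try:
            text = html_to_text(fetch_html(url))
        except (OSError, ValueError) as error:
            # Network failures, HTTP errors, and malformed URLs: skip this row.
            print(f"Skipping {identifier} ({url}): {error}")
            continue
        save_text(identifier, text, out_dir)
```

With the sheet saved as, say, companies.xlsx (a hypothetical file name), calling `scrape_companies("companies.xlsx", "output")` would write one `output/<identifier>.txt` file per row. Text that a page only renders through JavaScript will not show up with this approach, since only the downloaded HTML is parsed.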

