Python Forum
Download entire web pages and save them as html file with urllib.request
#1
I can save multiple web pages using this code; however, the saved HTML files do not render properly. For example, the text in tables is misaligned and images do not appear. I need to download complete pages, the way "Save as" does in a web browser, so that they display correctly.
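import urllib.request

url = 'https://asd.com/asdID='
for i in range(1, 5):
    print('     --> ID:', i)
    newurl = url + str(i)
    # read() returns bytes; calling str() on bytes yields "b'...'" with
    # escaped newlines, so write the raw bytes in binary mode instead.
    with urllib.request.urlopen(newurl) as page:
        pagebytes = page.read()
    with open(str(i) + '.html', 'wb') as f:
        f.write(pagebytes)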
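
A likely reason for the broken layout is that urlopen returns only the HTML document, while a browser's "Save as" also fetches the stylesheets, images and scripts that the document references. As a rough sketch of the first step toward that, the snippet below lists those resources using only the standard library. The names AssetCollector and find_assets are hypothetical, and the sample URL in the usage note is an assumption for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of images, scripts and stylesheets in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL the page was downloaded from
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            ref = attrs["src"]
        elif (tag == "link" and attrs.get("href")
              and "stylesheet" in (attrs.get("rel") or "").lower()):
            ref = attrs["href"]
        else:
            return
        # Resolve relative references against the page's own URL.
        self.assets.append(urljoin(self.base_url, ref))


def find_assets(html_text, base_url):
    """Return the resource URLs referenced by html_text, in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html_text)
    return collector.assets
```

Calling find_assets(pagebytes.decode('utf-8', errors='replace'), newurl) inside the loop would give the list of files to fetch for each page. Each one would then have to be downloaded and the references in the saved HTML rewritten to point at the local copies. This sketch does not cover resources loaded from inside CSS files or by JavaScript.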

Messages In This Thread
Download entire web pages and save them as html file with urllib.request - by fyec - Jul-13-2018, 07:34 AM
