Using Local Html Data
Python Forum (https://python-forum.io), Python Coding, General Coding Help
Thread: Using Local Html Data (/thread-33886.html)
Using Local Html Data - knight2000 - Jun-07-2021

Hi all,

As a newbie, I've been trying to get some practice in web scraping by extracting different elements from an HTML page. Rather than keep hitting an actual website for data (as a newbie I have a fair few failings!), I decided to temporarily save the HTML into a local file, so that I can load the file locally and keep practising.

I've spent more than 5 hours trying to get Python to read my HTML file and use it with BeautifulSoup. After reading about it in different places and still failing, I thought it was time to reach out for some advice.

Here's the last code I tried:

    from bs4 import BeautifulSoup

    with open("C:\Users\[UserName]\Desktop\localhtml.html") as fp:
        soup = BeautifulSoup(fp, 'html.parser')

And here's the error I'm getting:

As I type the file path into the code, the editor seems to suggest it, so I assume I'm referencing it properly. I've tried looking on YouTube and searching Google, but perhaps I'm looking up the wrong stuff, as I can't seem to find something that works.

The file I'm trying to refer to is saved as an HTML file. I've attached it, just in case there's a problem there.

Can anyone tell me how to get Python to open a local HTML file and use it with BeautifulSoup, please?

RE: Using Local Html Data - buran - Jun-07-2021

Don't use a single backslash in paths on Windows. A backslash followed by certain characters (in this case \U) is an escape sequence. Use a raw string, double backslashes, or forward slashes.
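For illustration, the three spellings buran mentions could look roughly like the sketch below. The path reuses the placeholder from the question, the helper name `load_soup` is hypothetical, and `encoding="utf-8"` is an assumption about how the file was saved. It is a minimal sketch, not the only way to do it.

```python
from pathlib import PureWindowsPath

# Placeholder path from the question; [UserName] stands in for the real account name.
raw_path = r"C:\Users\[UserName]\Desktop\localhtml.html"          # raw string: backslashes stay literal
doubled_path = "C:\\Users\\[UserName]\\Desktop\\localhtml.html"   # each backslash escaped
forward_path = "C:/Users/[UserName]/Desktop/localhtml.html"       # forward slashes also work on Windows

# The raw and doubled spellings produce the identical string.
assert raw_path == doubled_path
# Compared as Windows paths, the forward-slash spelling names the same file.
assert PureWindowsPath(raw_path) == PureWindowsPath(forward_path)


def load_soup(path):
    """Hypothetical helper: parse a local HTML file with BeautifulSoup."""
    # Imported here so the path comparison above runs even without bs4 installed.
    from bs4 import BeautifulSoup  # third-party package: beautifulsoup4

    # utf-8 is an assumption; change it if the file was saved with another encoding.
    with open(path, encoding="utf-8") as fp:
        return BeautifulSoup(fp, "html.parser")
```

With beautifulsoup4 installed and the real user name filled in, `soup = load_soup(raw_path)` should give the same kind of object the original snippet was aiming for.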
RE: Using Local Html Data - knight2000 - Jun-07-2021

(Jun-07-2021, 03:11 AM) buran wrote: Don't use backslash with paths on Windows. ...

Thanks buran. How annoyingly simple that was! Appreciate you taking the time to explain. Lots to learn!