Running A Parser In VSCode - And Write The Results Into A Csv-File
Python Forum (https://python-forum.io), Forum: General Coding Help (https://python-forum.io/forum-8.html), Thread: /thread-31530.html
Running A Parser In VSCode - And Write The Results Into A Csv-File - apollo - Dec-17-2020

Hi there, good day dear Python experts.

I am running a parser in VSCode and want to write the results into a CSV file, but I have got a small error in the following script:

    import requests
    from bs4 import BeautifulSoup
    import re
    import csv
    from tqdm import tqdm

    first = "https://path ?page={}"
    second = "https://path /{}_en"


    def catch(url):
        with requests.Session() as req:
            pages = []
            print("Loading All IDS\n")
            for item in tqdm(range(0, 347)):
                r = req.get(url.format(item))
                soup = BeautifulSoup(r.content, 'html.parser')
                numbers = [item.get("href").split("/")[-1].split("_")[0] for item in soup.findAll(
                    "a", href=re.compile("^path/"), class_="btn btn-default")]
                pages.append(numbers)
            return numbers


    def parse(url):
        links = catch(first)
        with requests.Session() as req:
            with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:
                writer = csv.writer(f)
                writer.writerow(["Name", "Address", "Site", "Phone",
                                 "Description", "Scope", "Rec", "Send", "PIC", "OID", "Topic"])
                print("\nParsing Now... \n")
                for link in tqdm(links):
                    r = req.get(url.format(link))
                    soup = BeautifulSoup(r.content, 'html.parser')
                    task = soup.find("section", class_="col-sm-12").contents
                    name = task[1].text
                    add = task[3].find(
                        "i", class_="fa fa-location-arrow fa-lg").parent.text.strip()
                    try:
                        site = task[3].find("a", class_="link-default").get("href")
                    except:
                        site = "N/A"
                    try:
                        phone = task[3].find(
                            "i", class_="fa fa-phone").next_element.strip()
                    except:
                        phone = "N/A"
                    desc = task[3].find(
                        "h3", class_="eyp-project-heading underline").find_next("p").text
                    scope = task[3].findAll("span", class_="pull-right")[1].text
                    rec = task[3].select("tbody td")[1].text
                    send = task[3].select("tbody td")[-1].text
                    pic = task[3].select(
                        "span.vertical-space")[0].text.split(" ")[1]
                    oid = task[3].select(
                        "span.vertical-space")[-1].text.split(" ")[1]
                    topic = [item.next_element.strip() for item in task[3].select(
                        "i.fa.fa-check.fa-lg")]
                    writer.writerow([name, add, site, phone, desc,
                                     scope, rec, send, pic, oid, "".join(topic)])


    parse(second)

See the output:

    martin@mx:~
    $ python /home/martin/dev/vscode/euro.py
    Loading All IDS

    100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 347/347 [08:01<00:00, 1.39s/it]
    Traceback (most recent call last):
      File "/home/martin/dev/vscode/euro.py", line 65, in <module>
        parse(second)
      File "/home/martin/dev/vscode/euro.py", line 29, in parse
        with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:
    TypeError: file() takes at most 3 arguments (4 given)
    martin@mx:~

I think the error is in this line:

    with open("Data.csv", 'w', newline="", encoding="UTF-8") as f:

I guess I need to have a closer look at the arguments here.

RE: Running A Parser In VSCode - And Write The Results Into A Csv-File - bowlofred - Dec-17-2020

Looks like you are using Python 3 options but running the script under Python 2.

    $ python3 -c 'open("in", "w", newline="", encoding="UTF-8")'
    $ python2 -c 'open("in", "w", newline="", encoding="UTF-8")'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    TypeError: file() takes at most 3 arguments (4 given)

RE: Running A Parser In VSCode - And Write The Results Into A Csv-File - jefsummers - Dec-17-2020

Yes indeed. Drop the newline and encoding parameters and I bet it works fine.

RE: Running A Parser In VSCode - And Write The Results Into A Csv-File - apollo - Jan-14-2021

Good day dear JeffSummers,

many thanks for the quick answer, great to hear from you. I did as you advised, but I seem to have run into a new error. I now run the code like this:

    def parse(url):
        links = catch(first)
        with requests.Session() as req:
            with open("Data.csv", 'w') as f:
                writer = csv.writer(f)
                writer.writerow(["Name", "Address", "Site", "Phone",
                                 "Description", "Scope", "Rec", "Send", "PIC", "OID", "Topic"])
                print("\nParsing Now... \n")

But now I have a different issue:

    martin@mx:~
    $ python /home/martin/dev/vscode/euro.py
    Loading All IDS

    100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:15<00:00, 1.56s/it]

    Parsing Now...

      5%|█████▎ | 1/20 [00:02<00:40, 2.14s/it]
    Traceback (most recent call last):
      File "/home/martin/dev/vscode/euro.py", line 65, in <module>
        parse(second)
      File "/home/martin/dev/vscode/euro.py", line 62, in parse
        scope, rec, send, pic, oid, "".join(topic)])
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 9: ordinal not in range(128)
    martin@mx:~
    $

So I have a UnicodeEncodeError. It seems that my system default encoding isn't UTF-8, so should I do something extra to avoid this? Should I try df.to_csv("data.csv", index=False, encoding="utf-8")? But then I expect it will fail again...
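The UnicodeEncodeError above comes from writing non-ASCII text (here u'\xed', the character í) through a file opened without an explicit encoding under Python 2. A minimal sketch of how the CSV writing step could look once the script runs under Python 3, where open() accepts both newline and encoding, is given below. The helper names write_rows and read_rows and the sample row are hypothetical and only for illustration; they are not part of the posted script.

```python
import csv
import os
import tempfile


def write_rows(path, header, rows):
    """Write a header and data rows to a UTF-8 encoded CSV file (Python 3)."""
    # newline="" lets the csv module control line endings itself;
    # encoding="utf-8" avoids depending on the system default encoding.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_rows(path):
    """Read the CSV file back as a list of rows (lists of strings)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


# Hypothetical sample data containing u'\xed', the character from the traceback.
header = ["Name", "Address"]
rows = [["Asociaci\u00f3n Jard\u00edn", "Calle Mayor 1, Madrid"]]

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Data.csv")
    write_rows(path, header, rows)
    round_trip = read_rows(path)

# True when the non-ASCII text survived the write/read round trip.
print(round_trip == [header] + rows)
```

In the posted script the same idea would mean keeping the original open("Data.csv", 'w', newline="", encoding="UTF-8") call and running it with a Python 3 interpreter, rather than dropping the two arguments.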
RE: Running A Parser In VSCode - And Write The Results Into A Csv-File - apollo - Jan-14-2021

Hi again, an update: I am running Python 2.7.1. I will install updates so that the script runs with Python 3.x, and I hope that will be successful. I guess I can then run this check again:

    $ python3 -c 'open("in", "w", newline="", encoding="UTF-8")'
    $ python2 -c 'open("in", "w", newline="", encoding="UTF-8")'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    TypeError: file() takes at most 3 arguments (4 given)

I also need to take care of the encoding options because of the UnicodeEncodeError mentioned above: it seems that my system default encoding isn't UTF-8, so should I do something extra to avoid issues here? Should I try df.to_csv("data.csv", index=False, encoding="utf-8")?

Step one: I will update Python to 3.x.
Step two: I will put all the arguments back, so that open() is called with newline="" and encoding="UTF-8" as in the python3 command above.

Looking forward to hearing from you.

RE: Running A Parser In VSCode - And Write The Results Into A Csv-File - snippsat - Jan-14-2021

You should look at how to set up VS Code and how it works with Python. It's not hard to see which version you are using, as it is always shown down in the left corner. See "VS Code from start".

[Image: overview of my setup, with Python and Code Runner as the most important extensions.]
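Besides the VS Code status bar, the script itself can report which interpreter is running it. The following is a minimal sketch, assuming it is placed at the top of the script; the function names describe_interpreter and require_python3 are hypothetical and not from the thread.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this script."""
    major, minor, micro = sys.version_info[:3]
    return "{0} (Python {1}.{2}.{3})".format(sys.executable, major, minor, micro)


def require_python3():
    """Exit with a readable message instead of a TypeError when run under Python 2."""
    if sys.version_info[0] < 3:
        sys.exit("Python 3 is required, but this is " + describe_interpreter())


require_python3()
print(describe_interpreter())
```

The traceback in this thread is consistent with the plain python command resolving to Python 2 on that machine. Starting the script as python3 /home/martin/dev/vscode/euro.py would be one way around that, assuming a python3 executable is installed.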