How to import data to excel - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: How to import data to excel (/thread-23930.html)
How to import data to excel - jkessous - Jan-23-2020

Hi guys, I have been working on a script that parses a website (thanks to snippsat). I am trying to import the data that is received into Excel. I know pandas is a strong tool, but I have no clue how to use it :) The code that does the parsing is:

    from requests import Session
    from bs4 import BeautifulSoup as bs

    with Session() as s:
        site = s.get("https://connectedinvestors.com/login")
        bs_content = bs(site.content, "html.parser")
        token = bs_content.find("input", {"name": "infusion_id"})["value"]
        login_data = {"email": "[email protected]", "password": "password", "infusion_id": token}
        s.post("https://connectedinvestors.com/login", login_data)

    import requests
    from bs4 import BeautifulSoup

    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/2'
    url_get = requests.get(url)
    soup = BeautifulSoup(url_get.content, 'html.parser')
    for tag in soup.find_all('div', class_="investorcard clearfix"):
        h4 = tag.find('h4')
        print(h4.a.text.strip(), tag.attrs['id'])

The output of the code is a list of names and IDs. What I need is for that output to be imported into an Excel file with two columns (Name and ID). Thanks.

RE: How to import data to excel - ibreeden - Jan-24-2020

The simplest way would be to write the data to a file named *.csv (Comma Separated Values). Excel is able to open such files. You would then have to add a field separator between the values. This field separator is usually a comma, but it may also be a semicolon; what Excel really accepts as a field separator is defined in the regional settings of your computer as the "List separator". So you should alter your print statement as follows:

    print(h4.a.text.strip() + ',' + tag.attrs['id'])

RE: How to import data to excel - snippsat - Jan-24-2020

Link to Thread. As shown by @ibreeden, you can make the output CSV and import that into Excel; this is the simplest way. I can show how it's done if you include Pandas, as mentioned in the other thread.
You can save directly to .xlsx format with df.to_excel:

    import requests
    from bs4 import BeautifulSoup
    import pandas as pd

    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/2'
    url_get = requests.get(url)
    soup = BeautifulSoup(url_get.content, 'html.parser')
    record = {}
    for tag in soup.find_all('div', class_="investorcard clearfix"):
        h4 = tag.find('h4')
        record[h4.a.text.strip()] = tag.attrs['id']

    df = pd.DataFrame(list(record.items()), columns=['Name', 'ID'])
    print(df)
    df.to_excel("output.xlsx", index=False)

In Excel: [screenshot of the resulting output.xlsx omitted]

RE: How to import data to excel - jkessous - Jan-25-2020

Hi snippsat, the code works great. I have another small question. The site Connected Investors has thousands of members, reached through URLs like this:

    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/2'
    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/3'
    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/4'
    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/5'
    url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/6'

and so on. How can I make this script run over all of these pages and add the data to the Excel file? Appreciate it a lot. Jonathan

RE: How to import data to excel - snippsat - Jan-26-2020

You should try to write something yourself. For example, make a list of URLs and loop over it. You could also save the record to JSON and then add/merge the dictionaries the JSON returns. Another way is to look for an append option for df.to_excel, or to write everything at once if that's possible. Here is an example of merging dictionaries; from Python 3.7, dictionaries are guaranteed to preserve insertion order.
    >>> r1 = {'Jonathan Kessous': '454517', 'Lawanda Osler': '549633'}
    >>> r2 = {'Naren Mistry': '514075', 'EJ Joier': '540484'}
    >>> new = {**r1, **r2}
    >>> new
    {'Jonathan Kessous': '454517', 'Lawanda Osler': '549633', 'Naren Mistry': '514075', 'EJ Joier': '540484'}

    import requests
    from bs4 import BeautifulSoup
    import pandas as pd
    import json

    def save_record(record):
        with open("record.json", "w") as j_out:
            json.dump(record, j_out)

    def open_record():
        with open("record.json") as j:
            saved_record = json.load(j)
            return saved_record

    url_lst = [
        'https://connectedinvestors.com/member/jonathan-kessous/friends/2',
        'https://connectedinvestors.com/member/jonathan-kessous/friends/3'
    ]

    record = {}
    for url in url_lst:
        url_get = requests.get(url)
        soup = BeautifulSoup(url_get.content, 'html.parser')
        for tag in soup.find_all('div', class_="investorcard clearfix"):
            h4 = tag.find('h4')
            record[h4.a.text.strip()] = tag.attrs['id']

    if __name__ == '__main__':
        save_record(record)
        df = pd.DataFrame(list(record.items()), columns=['Name', 'ID'])
        print(df)
        df.to_excel("output.xlsx", index=False)
        #print(open_record())
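[Editor's note] The "make a list and loop" plus dictionary-merge advice above can be sketched as two small helpers. This is a minimal illustration, not code from the thread: the names `BASE`, `page_urls`, and `merge_records` are assumptions, and the sample page dicts stand in for records scraped by the code above.

```python
# Sketch of the list-and-loop approach: build the friends-page URLs for a
# range of page numbers, then merge the per-page {name: id} dicts into one.
# BASE, page_urls and merge_records are illustrative names, not from the thread.
BASE = "https://connectedinvestors.com/member/jonathan-kessous/friends/{}"

def page_urls(start, stop):
    # One URL per page number, inclusive of both ends.
    return [BASE.format(n) for n in range(start, stop + 1)]

def merge_records(per_page):
    # Merge the per-page dicts; from Python 3.7 insertion order is kept,
    # and a later page wins if the same name appears on two pages.
    merged = {}
    for rec in per_page:
        merged.update(rec)
    return merged

if __name__ == "__main__":
    urls = page_urls(2, 6)
    print(urls[0])
    # Each dict below stands in for one page's scraped record.
    pages = [{"Jonathan Kessous": "454517"}, {"Naren Mistry": "514075"}]
    print(merge_records(pages))
```

The merged dict can then be turned into a DataFrame and written once with `df.to_excel`, exactly as in the script above, which sidesteps the need for any append mode.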