Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to import data to excel
#1
hi Guys,

I have been working on a script that does parsing to a website (thanks to snippsat).
I am trying to import the data that is recieved to excel. I know pandas is a strong tool but have no clue how to use it :)

the code that does the parsing is:
from requests import Session
from bs4 import BeautifulSoup as bs

 
with Session() as s:
    site = s.get("https://connectedinvestors.com/login")
    bs_content = bs(site.content, "html.parser")
    token = bs_content.find("input", {"name":"infusion_id"})["value"]
    login_data = {"email":"[email protected]","password":"password", "infusion_id":token}
    s.post("https://connectedinvestors.com/login",login_data)

import requests
from bs4 import BeautifulSoup

url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/2'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')
for tag in soup.find_all('div', class_="investorcard clearfix"):
    h4 = tag.find('h4')
    print(h4.a.text.strip(), tag.attrs['id'])
the output of the code is like this:
Output:
... print(h4.a.text.strip(), tag.attrs['id']) ... Jonathan Kessous 454517 Dorothy Lopez 5374 Malachi Minenger 503265 Carla Moore 492866 Marty Gergen 404449 Daniel Nevins 326033 Mike Abadir 406439 Meeco White 342115 Craig Dell 505777 Keiara Smith 409082 Rebecca Jones 527554 jeff black 358077 Steven Stoughton 488587 Brian Newell 539314 Curtis Dafler 410079 Marco Leone 19002 Oscar Robertson 522916 Nilson Almonte Noesi 472794 Nikia Epiconepropertiesllc 508013 Andrew Tuggle 342415 R. Gibsonproperties 542422 Dominic Picione 543130 Yaron Kendelker 253869 Charles Jones 539585 Eric Brewer 543986 Ginis Garcia 7206 Richard Williams 466632 Lloyd Pierson 538263 Matthew Lorimer 372537 Nate Hedrick 501890 Steven Jackson 521127 >>>
what i need is that the output will be imported to an excel file with two columns (Name and ID).

thanks
Reply
#2
The most simple way would be to write the data to a file named *.CSV (Comma Separated Values). Excel is able to open such files. Then you would have to add a field separator between the values. This field separator is usually a comma but it may also be a semicolon. What Excel really accepts as a field separator is defined in the regional settings of your computer as "List separator".
So you should alter your print statement as follows:
print(h4.a.text.strip() + ',' + tag.attrs['id'])
Reply
#3
Link to Thread.
As shown bye @ibreeden,can make so it's CSV and import that to Excel.
This is the simplest way.

Can show how it's done if include Pandas as mention in other Thread.
Can save df.to_excel directly to .xlsx format.
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/2'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')
record = {}
for tag in soup.find_all('div', class_="investorcard clearfix"):
    h4 = tag.find('h4')
    record[h4.a.text.strip()] = tag.attrs['id']

df = pd.DataFrame(list(record.items()), columns=['Name', 'ID'])
print(df)
df.to_excel("output.xlsx", index=False)
Output:
Name ID 0 Jonathan Kessous 454517 1 Tracy Williams 970885 2 Amanda Taylor 552089 3 Jason Stemo 1091251 4 Gerardo Marina 647958 5 Robin Bogin 377596 6 Foster Algier 311283 .....ect
In Excel:
[Image: QCHzbO.png]
Reply
#4
Hi Snippsat,
The code works great.
I have another small question.
The site connected investors has thousands of members going by URL's like this
url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/2'
url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/3'
url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/4'
url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/5'
url = 'https://connectedinvestors.com/member/jonathan-kessous/friends/6'
and so on.
how can i make this script to run recursively and adding the data to the excel file.

appreciate it a lot
Jonathan
Reply
#5
You should try to write something yourself.
Make eg a list and loop,could also save record to json then could add/merge the dictionary json return.
Other way look for a append way for df.to_excel,or write all at once if that's possible.

Example of merge dictionaries,from 3.7 are order dictionaries guaranteed.
>>> r1 = {'Jonathan Kessous': '454517', 'Lawanda Osler': '549633'}
>>> r2 = {'Naren Mistry': '514075', 'EJ Joier': '540484'}

>>> new = {**r1, **r2}
>>> new
{'Jonathan Kessous': '454517', 'Lawanda Osler': '549633', 'Naren Mistry': '514075', 'EJ Joier': '540484'}
import requests
from bs4 import BeautifulSoup
import pandas as pd
import json

def save_record(record):
    with open("record.json", "w") as j_out:
        json.dump(record, j_out)

def open_record():
    with open("record.json") as j:
        saved_record = json.load(j)
        return saved_record

url_lst = [
    'https://connectedinvestors.com/member/jonathan-kessous/friends/2',
    'https://connectedinvestors.com/member/jonathan-kessous/friends/3'
]

record = {}
for url in url_lst:
    url_get = requests.get(url)
    soup = BeautifulSoup(url_get.content, 'html.parser')
    for tag in soup.find_all('div', class_="investorcard clearfix"):
        h4 = tag.find('h4')
        record[h4.a.text.strip()] = tag.attrs['id']

if __name__ == '__main__':
    save_record(record)
    df = pd.DataFrame(list(record.items()), columns=['Name', 'ID'])
    print(df)
    df.to_excel("output.xlsx", index=False)
    #print(open_record())
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Copy data from Excel and paste into Discord (Midjourney) Joe_Wright 4 1,923 Jun-06-2023, 05:49 PM
Last Post: rajeshgk
  Plot a pandas data fram via pyqtgraph with an modul import and qt designer widget Nietzsche 0 801 May-29-2023, 02:42 PM
Last Post: Nietzsche
Video doing data treatment on a file import-parsing a variable EmBeck87 15 2,665 Apr-17-2023, 06:54 PM
Last Post: EmBeck87
  Reading data from excel file –> process it >>then write to another excel output file Jennifer_Jone 0 1,046 Mar-14-2023, 07:59 PM
Last Post: Jennifer_Jone
  Python 3.11 data import question love0715 2 771 Mar-05-2023, 06:50 PM
Last Post: snippsat
  Import XML file directly into Excel spreadsheet demdej 0 798 Jan-24-2023, 02:48 PM
Last Post: demdej
  How to properly format rows and columns in excel data from parsed .txt blocks jh67 7 1,798 Dec-12-2022, 08:22 PM
Last Post: jh67
  Trying to Get Arduino sensor data over to excel using Python. eh5713 1 1,616 Dec-01-2022, 01:52 PM
Last Post: deanhystad
  Appending a row of data in an MS Excel file azizrasul 3 1,138 Nov-06-2022, 05:17 PM
Last Post: azizrasul
  Moving data from one Excel to another and finding maximum profit azizrasul 7 1,411 Oct-06-2022, 06:13 PM
Last Post: azizrasul

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020