Python Forum
Scraping Issue with BS
#12
Fantastic!

This does the job. The next thing I need to do is grab the city, state and category for each listing. They appear in only two places: the URL and the top of the page. I will extract them with a regular expression and add them to the dictionary.

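PAGE = 0
while True:
    html = get_html(session, BASE_URL, PAGE)

    listings = get_listings(html)
    for listing in listings:
        print(listing['company'], listing['phone'], listing['rating'])

    # Raw string so the backslashes that escape '@' reach the CSS selector intact.
    button = html.select(r'button.page-next.\@px-1.\@ml-1')
    # Stop when there is no "next" button or it is disabled (last page).
    if not button or button[0].attrs.get('disabled') == 'disabled':
        break

    # Each page of results advances the offset by 25.
    PAGE += 25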
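
For the city, state and category, here is a rough sketch of the RegEx idea. The URL layout shown in the comment is only a guess for illustration, and the names URL_PATTERN and parse_listing_url are made up here; the pattern would need adjusting to whatever the real URLs look like.

```python
import re

# ASSUMPTION: a hypothetical URL layout of the form
#   .../c.<Category>.<City>.<ST>.-<id>.html
# The real site may use a different layout; adjust the pattern to match.
URL_PATTERN = re.compile(
    r'/c\.(?P<category>[^.]+)\.(?P<city>[^.]+)\.(?P<state>[A-Z]{2})\.-\d+\.html$'
)


def parse_listing_url(url):
    """Return a dict with category, city and state, or None if no match."""
    match = URL_PATTERN.search(url)
    if match is None:
        return None
    return match.groupdict()
```

If the URL matches, the result could be merged into each listing with something like listing.update(parse_listing_url(url)), so every row carries its own city, state and category.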
I also have to read each URL from a file instead of hard-coding it. In the following code, I append the listings for each state > city > category, then write the rows to a CSV file. One question: how do I write the column names only once?

import csv


def save_csv(listings, state):
    # One output file per state; 'a' appends so earlier pages are kept.
    filename = 'home-advisor-data-{}.csv'.format(state)
    with open(filename, 'a', encoding='utf-8', newline='') as file:
        writer = csv.writer(file, delimiter=',')
        # While paginating through each page of results, this writes the
        # column names again every time. How do I avoid this? I only want
        # them at the top, once.
        writer.writerow(['Company', 'Phone Number', 'Rating'])
        for listing in listings:
            # Keys match the ones used in the pagination loop.
            writer.writerow(
                [listing['company'], listing['phone'], listing['rating']])
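
One possible way to write the column names only once, sketched under some assumptions: check whether the file is missing or empty before appending, and write the header row only in that case. The FIELDNAMES constant is a name introduced here, and the listing keys 'company', 'phone' and 'rating' are assumed to match the pagination loop.

```python
import csv
import os

# Name introduced for this sketch; the columns come from the post above.
FIELDNAMES = ['Company', 'Phone Number', 'Rating']


def save_csv(listings, filename):
    """Append listings to filename, writing the header only for a new file."""
    # The header is needed only when the file is missing or still empty.
    write_header = (not os.path.exists(filename)
                    or os.path.getsize(filename) == 0)
    with open(filename, 'a', encoding='utf-8', newline='') as file:
        writer = csv.writer(file, delimiter=',')
        if write_header:
            writer.writerow(FIELDNAMES)
        for listing in listings:
            # ASSUMPTION: each listing is a dict with these three keys.
            writer.writerow(
                [listing['company'], listing['phone'], listing['rating']])
```

Because the check looks at the file on disk, the header also stays unique across separate runs of the script. The alternative would be to open the file once before the pagination loop, write the header there, and pass the writer into the loop.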

Messages In This Thread
Scraping Issue with BS - by muzikman - Dec-07-2021, 11:49 AM
RE: Scraping Issue with BS - by snippsat - Dec-07-2021, 04:17 PM
RE: Scraping Issue with BS - by muzikman - Dec-07-2021, 10:36 PM
RE: Scraping Issue with BS - by muzikman - Dec-08-2021, 02:07 PM
RE: Scraping Issue with BS - by snippsat - Dec-08-2021, 02:14 PM
RE: Scraping Issue with BS - by muzikman - Dec-08-2021, 02:19 PM
RE: Scraping Issue with BS - by muzikman - Dec-08-2021, 07:38 PM
RE: Scraping Issue with BS - by muzikman - Dec-08-2021, 09:37 PM
RE: Scraping Issue with BS - by muzikman - Dec-09-2021, 02:09 PM
RE: Scraping Issue with BS - by muzikman - Dec-09-2021, 02:10 PM
RE: Scraping Issue with BS - by snippsat - Dec-09-2021, 04:10 PM
RE: Scraping Issue with BS - by muzikman - Dec-10-2021, 08:49 AM
