Python Forum
web scraping to csv formatting problems
#1
Hello,

I am trying to scrape a web page and write the result to a CSV file. The content I want does reach the CSV, but the layout is wrong: the same data is repeated on every row, and the unique values for all accounts run across the columns of each row instead of being listed down the page under the headers.

This is the result I'm getting: CSV Output

The CSV should list the accounts one per line, going down and not across as in this example. This is the original wiki page that I'm scraping (had to block out company info): Original Wiki Page

This is the code I am using:
import csv
import os
import requests
from requests import get
from requests.exceptions import RequestException
from contextlib import closing
from bs4 import BeautifulSoup

output_dir = os.path.join( '..', 'output_files', 'aws_accounts_list')
source = 'aws_wiki_page'
destination = os.path.join(output_dir, source + '.csv' )
url = 'https://wiki.us.cworld.company.com/display/6TO/AWS+Accounts'
page = requests.get(url, auth=('me', 'secret'))

headers = ['Company Account Name', 'AWS Account Name', 'Description', 'LOB', 'AWS Account Number', 'Connected to Homebase', 'Peninsula or Island', 'URL', 'Owner', 'Engagement Code', 'CloudOps Access Type']

soup = BeautifulSoup(page.text, 'lxml')

rows = []
for tr in soup.select('tr'):
    rows.append([td.text for td in soup.select('td')])

with open(destination, 'w+', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',',
                            quotechar='"', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(headers)
    for row in rows:
        writer.writerow(row)
        print(row)
What am I doing wrong?
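
One likely cause, judging only from the code as posted: inside the loop, `soup.select('td')` searches the whole document rather than the current row, so every iteration appends the same list of all cells on the page. That would match the symptom of identical rows with every value running across. Below is a minimal sketch of the row-scoped version using `tr.select('td')`. The `SAMPLE_HTML` table and the helper names `extract_rows` and `rows_to_csv` are made up for illustration, since the real wiki markup is not shown, and `html.parser` is used here only so the sketch does not depend on lxml.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for the wiki table; the real page's markup is not shown.
SAMPLE_HTML = """
<table>
  <tr><th>Company Account Name</th><th>AWS Account Name</th></tr>
  <tr><td>acct-one</td><td>aws-one</td></tr>
  <tr><td>acct-two</td><td>aws-two</td></tr>
</table>
"""


def extract_rows(html):
    """Return one list of cell texts for each table row that has <td> cells."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for tr in soup.select('tr'):
        # Search inside this row only (tr.select), not the whole page (soup.select).
        cells = [td.get_text(strip=True) for td in tr.select('td')]
        if cells:  # rows made only of <th> cells yield nothing; skip them
            rows.append(cells)
    return rows


def rows_to_csv(headers, rows):
    """Write the headers and rows to an in-memory CSV and return it as text."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_HTML)
csv_text = rows_to_csv(['Company Account Name', 'AWS Account Name'], rows)
print(csv_text)
```

In the original script the same change would be replacing `soup.select('td')` with `tr.select('td')` on the line inside the `for tr in soup.select('tr'):` loop. If the page contains more than one table, the selector may also need to be narrowed to the accounts table.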

Messages In This Thread
web scraping to csv formatting problems - by bluethundr - Jul-02-2019, 08:38 PM
