Python Forum
Web scraper not populating .txt with scraped data
#3

Thank you! That helped a lot, and that was such a great tip. I actually got it to work. Here is my code:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Scrape the links to each monthly results page
url = "http://www.calotteryx.com/Fantasy-5/drawing-results-calendar.htm"
response = requests.get(url)
content = response.content
soup = BeautifulSoup(content, 'html.parser')
links = []
for link in soup.find_all('a', class_='noline'):
    href = link.get('href')
    # Skip anchors that have no href, then resolve the link against the page URL
    if href and 'Fantasy' in href:
        links.append(urljoin(url, href))

# Scrape the winning numbers for each monthly results page
winning_numbers = []
for link in links:
    response = requests.get(link)
    content = response.content
    soup = BeautifulSoup(content, 'html.parser')
    for tag in soup.find_all('div', class_='ball blue5 fcblack1'):
        numbers = tag.text.strip().split()
        winning_numbers.append(numbers)

# Write the winning numbers to a file
with open('winning_numbers.txt', 'w') as Nums:
    for winners in winning_numbers:
        Nums.write("%s\n" % winners)
    print('Done')
The last thing I need to figure out is how to format the data that's been scraped. Currently every number (01, 15, 02, 10, etc.) is treated as its own string, I think. So it's being written to the document via
 Nums.write("%s\n" % winners)
and its format is vertical. After every string it creates a new line, and I think I understand why: it's the %s\n. What I want instead is a new line after every 5 numbers (or I guess every 10 digits, because each string has two digits in it). My solution was to try something like
Nums.write('%s %s %s %s %s\n' % winners)
but I get a TypeError: not enough arguments for format string.
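
A likely cause of that TypeError, sketched under assumptions: the % operator only spreads a tuple across several %s placeholders, while a list counts as one single value. The row below is made-up sample data (not real scraped results), and the names row, error_message and line are only for illustration.

```python
# Hypothetical sample row; the scraper above appears to store one number per list.
row = ['01', '15', '02', '10', '33']

# A list is treated as ONE argument by %, so five %s placeholders cannot be filled.
try:
    '%s %s %s %s %s\n' % row
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Converting the list to a tuple lets % fill all five placeholders.
line = '%s %s %s %s %s\n' % tuple(row)
```

This only helps once a row really holds five numbers; with one number per list, the tuple form would still fail.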

I'd appreciate being pointed in the right direction. I'm thinking I could possibly use f-strings or format(), but I'm unsure whether those will work the way I want.
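
One possible direction, as a minimal sketch rather than a definitive fix: flatten the scraped lists, slice them into groups of five, and join each group with spaces. The sample data, the NUMBERS_PER_DRAW constant and the names flat and lines are assumptions for illustration; the sketch also assumes every draw contributes exactly five numbers in order.

```python
# Hypothetical stand-in for the scraped data: one single-item list per ball,
# which is what tag.text.strip().split() appears to produce for each div.
winning_numbers = [['01'], ['15'], ['02'], ['10'], ['33'],
                   ['04'], ['09'], ['21'], ['27'], ['38']]

NUMBERS_PER_DRAW = 5  # assumption: five numbers per drawing

# Flatten the nested lists into one flat list of number strings.
flat = [number for numbers in winning_numbers for number in numbers]

# Slice the flat list into consecutive groups of five and join each with spaces.
lines = [
    " ".join(flat[i:i + NUMBERS_PER_DRAW])
    for i in range(0, len(flat), NUMBERS_PER_DRAW)
]
```

Each entry in lines could then be written inside the existing with block using Nums.write(f"{line}\n"), which gives one drawing per line. str.join is used here instead of a fixed '%s %s %s %s %s' pattern so a short final group does not raise an error.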

Anyway, thanks for all the help.


Messages In This Thread
RE: Web scraper not populating .txt with scraped data - by BlackHeart - Apr-02-2023, 04:15 PM
