Python Forum
Don't know what went wrong
#1
As practice, I was trying to make a web crawler that gets only the years from a specific page, but it failed.
The code seems fine to me, and I have no idea what is wrong.

import requests
from bs4 import BeautifulSoup

def search_year(page_number):
    url = "https://b-ok.asia/s/Python?page=" + str(page_number)
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text)
    for link in soup.findAll("div", {"class": "property_value"}):
        year = link.string
    print(year)

search_year(1)
#2
Hello.

When you say it failed, that could be because there is a "Y" in the day (Tuesday)... No, seriously: without a traceback, or even a description in your own words, no one here has the telepathy to guess what went wrong.

Please supply as much of the following as you can: your environment, your IDE (if you use one), the Python version you are using, and how the error shows up. Or are you basing "failed" only on the fact that it returned no results? It could even be security on the site you are crawling; the possibilities are endless :) Basically, supply all the details that will let members make an informed suggestion about your problem.

Some things you can check for yourself:
  • Am I sure the methods I am calling are correct and use the correct case?
  • If I change my find-all call to something simpler, does it return anything at all? Use the print function to display your variables, etc.
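To make those checks concrete, here is a minimal sketch you could try offline. One thing does stand out in the posted code: print(year) sits outside the for loop, so at most the last value is printed, and if nothing matches at all, year is never assigned and the print raises an UnboundLocalError. The function name extract_values and the SAMPLE_HTML markup below are made up for illustration; whether the real page actually puts the year in a div with class "property_value" is an assumption carried over from the original post, not something I have verified.

```python
from bs4 import BeautifulSoup


def extract_values(html, css_class="property_value"):
    """Return the text of every <div> with the given class, in page order."""
    # Naming the parser explicitly avoids BeautifulSoup's
    # "no parser was explicitly specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for div in soup.find_all("div", class_=css_class):
        # get_text() still returns text when the div has several children,
        # a case where .string would be None.
        values.append(div.get_text(strip=True))
    return values


# Made-up sample markup (hypothetical), only for trying the function offline.
SAMPLE_HTML = """
<div class="property_value">2019</div>
<div class="property_value"> 2020 </div>
<div class="other">ignored</div>
"""

values = extract_values(SAMPLE_HTML)
print(len(values), "matching divs found")
for value in values:  # print inside the loop so every value is shown
    print(value)
```

Against the real site you would pass in the downloaded page text, for example extract_values(requests.get(url).text), and it is worth printing the response's status_code first. If the count comes back as 0, either the request did not return the page you expected or the class name does not match what is in the HTML.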
Thanks
Regards
-------- *
“Outside of a dog, a book is man's best friend. Inside of a dog it's too dark to read.”