Python Forum

Hello there,
I am working on my first web scraper and have a problem with the pagination of the website I want to crawl.

This is my code so far:

import bs4 as bs
import urllib.request
import pprint
import time

source = urllib.request.urlopen('https://radiobochum.radiosparbox.de/').read()
soup = bs.BeautifulSoup(source, 'lxml')


Anbieter = []
for anbieter in soup.select('h2.artist'):
    Anbieter.append(str(anbieter.text[1:]))

Anzahl = []
for anzahlTi in soup.select('span.stock'):
    Anzahl.append(str(anzahlTi.text))

westfunk = {}
for f, b in zip(Anbieter, Anzahl):
    westfunk[f] = b

# Header line, in English: "Westfunk statistics as of: <date> <time>"
print(time.strftime("Statistiken der Westfunk am: %d.%m.%Y %H:%M:%S\n"))
print("__________________________________________________________________________")

for anbieter_out, anzahl_out in westfunk.items():
    print(anbieter_out + ":" + anzahl_out)

I know it's not the best, but I hope you can help me with my problem.

Thank you so much in advance,
Greetings from Germany.

You can handle pagination to some extent with Beautiful Soup alone by following the link to the next page, but if the pagination is driven by JavaScript, that approach will probably fail. A better solution is to use Selenium, see: http://seleniumhome.blogspot.com/2013/07...using.html
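
To illustrate the "following links" approach, here is a rough sketch rather than a tested solution for this site. It reuses the `h2.artist` and `span.stock` selectors from the question. The `a.next` selector for the next-page link, the names `fetch_page` and `scrape_all_pages`, and the `max_pages` limit are all assumptions made up for this example; check the real markup in your browser's developer tools and adjust. It will not help if the next page is loaded by JavaScript.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_page(url):
    """Download one page and return its raw HTML."""
    with urlopen(url) as response:
        return response.read()


def scrape_all_pages(start_url, fetch=fetch_page, max_pages=50):
    """Follow 'next' links page by page and collect provider -> stock."""
    results = {}
    seen = set()
    url = start_url
    # Stop when there is no next link, a page repeats, or the limit is hit.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        soup = BeautifulSoup(fetch(url), 'html.parser')
        # strip=True removes surrounding whitespace (the question used [1:]).
        names = [h.get_text(strip=True) for h in soup.select('h2.artist')]
        stocks = [s.get_text(strip=True) for s in soup.select('span.stock')]
        results.update(zip(names, stocks))
        # ASSUMPTION: the next page is linked as <a class="next" href="...">.
        next_link = soup.select_one('a.next[href]')
        url = urljoin(url, next_link['href']) if next_link else None
    return results
```

You would call it as `westfunk = scrape_all_pages('https://radiobochum.radiosparbox.de/')` and then print the dictionary exactly as in the original loop. Passing the download function in as `fetch` is only there so the logic can be tried against saved HTML without hitting the site.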

Try looking at the developers' pages. They have blogs or articles in which they give explanations.

prejni

Larz60+

alekson