Python Forum
BeautifulSoup pagination using href
I am trying to scrape all the events from https://www.onthisday.com/events/february/5. I am getting all the events from the first page. How can I get the events from the second page and merge them into one list?

Right now I try to grab the next-page link and parse that page, but it doesn't work: I still get only the results from the first page.

Here is my code:

from typing import List
import requests as _requests
import bs4 as _bs4

def _generate_url(month: str, day: int) -> str:
    url = f'https://www.onthisday.com/events/{month}/{day}'
    return url

def _get_page(url: str) -> _bs4.BeautifulSoup:
    _page = _requests.get(url)
    soup = _bs4.BeautifulSoup(_page.content, 'html.parser')
    return soup

def events_of_the_day(month: str, day: int) -> List[str]:
    """
    Return the events of a given day
    """
    
    url = _generate_url(month, day)
    page = _get_page(url)
    next_link = page.select_one("a.pag__next")
    raw_events = [event.text for event in page.select("li.event")]
    if next_link:
        next_url = 'https://www.onthisday.com/events'+next_link['href']
        page_next = _get_page(next_url)
        for eve in page_next.select("li.event"):
            print(eve.text)
    
    #print(raw_events)
    

events_of_the_day("february", 5)
Note:

Some pages contain a link to a next page and some don't, so I want to handle both cases.
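
A minimal sketch of one way to follow the next-page link and merge everything into a single list. Two things stand out in the code above: events_of_the_day prints the second-page events instead of adding them to raw_events, and it never returns the list. A possible third issue, which can't be confirmed without seeing the site's markup, is that if the href already starts with /events/..., prepending 'https://www.onthisday.com/events' would duplicate that path segment. The sketch avoids guessing by resolving the href with urllib.parse.urljoin. The names collect_events, fetch_soup and max_pages are hypothetical, and the selectors li.event and a.pag__next are taken from the post as-is on the assumption that they match the page.

```python
from typing import Callable, List, Optional
from urllib.parse import urljoin

import bs4
import requests


def fetch_soup(url: str) -> bs4.BeautifulSoup:
    """Download a page and parse it (same idea as _get_page in the post)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return bs4.BeautifulSoup(response.content, "html.parser")


def collect_events(
    start_url: str,
    fetch: Callable[[str], bs4.BeautifulSoup] = fetch_soup,
    max_pages: int = 20,  # hypothetical safety cap, not something the site defines
) -> List[str]:
    """Follow "next" links from start_url and merge all events into one list."""
    events: List[str] = []
    seen = set()
    url: Optional[str] = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch(url)
        # Selector assumed from the original post.
        events.extend(li.get_text(strip=True) for li in page.select("li.event"))
        next_link = page.select_one("a.pag__next")
        href = next_link.get("href") if next_link else None
        # urljoin resolves the href against the current page URL, so it works
        # whether the href is absolute, root-relative, or only a query string.
        url = urljoin(url, href) if href else None
    return events
```

Calling collect_events("https://www.onthisday.com/events/february/5") should then return one list for the whole day. A page without a next link simply ends the loop after the first fetch, so both situations go through the same code path. The fetch parameter is there only so the pagination logic can be tried against saved HTML without hitting the site.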


Messages In This Thread
BeautifulSoup pagination using href - by rhat398 - Jun-29-2021, 10:22 PM