Python Forum
Mechanize and BeautifulSoup read not correct hours
Hi all.
I'm having a problem scraping information from this URL (the one assigned to URL_PAGE in the code below).

The problem is that the hours in the HTML source retrieved by mechanize are wrong: every time is off by -1 hour. I think it might depend on some local configuration on my system (I live in Italy and the site might use another time zone).

That said, I have not been able to solve the problem, so I'm asking for some help :)

This is a short working extract of my code:
from __future__ import print_function

from bs4 import BeautifulSoup

import regex as re
import mechanize
from datetime import datetime

URL_PAGE = 'https://www.myfxbook.com/forex-economic-calendar'

# retrieve the HTML source with a browser-like User-agent
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
html_content = br.open(URL_PAGE).read()

# soup
soup = BeautifulSoup(html_content, "html.parser")

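# regexes for extraction
cal_row_re = re.compile(r'^calRow.*')       # ids of the calendar rows
date_re = re.compile(r'\w+\s?\d+:\d+')      # text node holding the date and time

# extract the events: one row per calendar entry
CalEvents = soup.find_all(id=cal_row_re)

for singleEvent in CalEvents:
    # first text node in the row that looks like a date/time
    date = singleEvent.find(text=date_re).strip()
    # the event name is the text of the element with class 'noUnderline'
    eventName = singleEvent.find(class_='noUnderline').get_text().strip()
    print(date, eventName, sep=';')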
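If the offset really is a time zone matter, the scraped strings could be shifted after extraction instead of changing how the page is fetched. What follows is only a minimal sketch under assumptions that are not confirmed anywhere above: that the page serves its times in UTC (a one-hour lag behind Italian winter time would be consistent with that), and that the date text looks like "Jan 11, 09:30". The helper name `to_local`, the format string, and the explicit `year` argument are all hypothetical and would need adjusting to the real page.

```python
from datetime import datetime, timedelta, timezone

# ASSUMPTION: the site serves times in UTC. This is a guess, not something
# the page or the original code confirms.
SOURCE_TZ = timezone.utc

# Fixed winter offset for Italy (CET = UTC+1). A fixed offset ignores
# daylight saving time; zoneinfo.ZoneInfo("Europe/Rome") (Python 3.9+)
# could be passed instead to handle it.
LOCAL_TZ = timezone(timedelta(hours=1), "CET")


def to_local(date_text, year, fmt="%b %d, %H:%M",
             source_tz=SOURCE_TZ, local_tz=LOCAL_TZ):
    """Parse a scraped date string and convert it to the local time zone.

    `fmt` is a hypothetical format; the scraped text carries no year,
    so the caller supplies one.
    """
    # Parse together with the year so that dates such as Feb 29 are valid.
    naive = datetime.strptime("%d %s" % (year, date_text), "%Y " + fmt)
    # Attach the assumed source zone, then convert to the local one.
    return naive.replace(tzinfo=source_tz).astimezone(local_tz)


if __name__ == "__main__":
    converted = to_local("Jan 11, 09:30", 2019)
    print(converted.strftime("%b %d, %H:%M"))  # prints "Jan 11, 10:30"
```

In the loop above, `date` would be passed through `to_local` before printing. If the site turns out to use a different zone, only `SOURCE_TZ` needs to change.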
Thank you in advance


Messages In This Thread
Mechanize and BeautifulSoup read not correct hours - by vaeVictis - Jan-11-2019, 09:38 AM
