Python Forum
Need some help with Selenium
#1
I have written the code below to scrape data from betmonitor.

When I run the code, I get the text 'Today 13 Aug 20:00' 18 times. I would like to get all the different dates instead of the same text 18 times. So the result I want is:

"Today 13 Aug 20:00

Saturday 14 Aug 16:30

etc."

Can somebody help me get all the dates instead of the same date many times?

Thanks!
from selenium import webdriver

url = 'https://www.betmonitor.com/odds-comparison/football/netherlands-eredivisie/10000060'

driver = webdriver.Chrome()
driver.get(url)

header = driver.find_element_by_id('content')

event = header.find_elements_by_class_name('league-event-new')

for details in event:
    datum = details.find_element_by_xpath('//div[@class="evtime"]').text
    print(datum) 
#2
Do it like this, and have a look at the setup section.
There you can, for example, run the browser headless or set other options.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from time import sleep

#--| Setup
options = Options()
options.add_argument("--headless")
driver = webdriver.Chrome(executable_path=r'C:\cmder\bin\chromedriver.exe', options=options)
#--| Parse or automation
url = "https://www.betmonitor.com/odds-comparison/football/netherlands-eredivisie/10000060"
driver.get(url)
time_play = driver.find_elements_by_css_selector("div.evtime")
for time_event in time_play:
    print(time_event.text)
    print('-' * 10)
Output:
Today 14 Aug 18:45
----------
Today 14 Aug 20:00
----------
Today 14 Aug 21:00
----------
Sunday 15 Aug 12:15
----------
Sunday 15 Aug 14:30
----------
... etc.
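For reference, the loop in post #1 most likely printed the same text 18 times because an XPath that starts with `//` is evaluated against the whole document, even when it is called on a single element, so every iteration found the first `evtime` on the page. Prefixing the expression with a dot, as in `details.find_element_by_xpath('.//div[@class="evtime"]')`, keeps the search inside the current element. Here is a minimal offline sketch of that relative lookup. It uses the standard library's ElementTree on a made-up two-event fragment instead of a live browser, so the HTML below is an assumption and not the real page source.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the page structure.
SAMPLE = """
<div id="content">
  <div class="league-event-new"><div class="evtime">Today 13 Aug 20:00</div></div>
  <div class="league-event-new"><div class="evtime">Saturday 14 Aug 16:30</div></div>
</div>
"""

root = ET.fromstring(SAMPLE)
events = root.findall(".//div[@class='league-event-new']")

# The leading "." makes each search relative to the current event element,
# so every event yields its own time instead of the first one on the page.
times = [event.find(".//div[@class='evtime']").text for event in events]
for t in times:
    print(t)
```

The same leading-dot rule applies to any XPath used on an element returned by Selenium.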
#3
That works for me! Thank you very much.
I was hoping that it was just a little modification in my script...
#4
I'm answering the code question from your PM here, as we want the knowledge to be available on the forum for everyone.
WallieA Wrote: Is it easy to also add other elements that I want to scrape, such as the classes for teams and odds?

I have tried it myself, but I got errors.

Thanks!
It's not hard, but you have to look at the page source and try to understand its structure, e.g. in Chrome/Firefox DevTools.
Try to get the info for one event first; looking at the source, all of it sits inside class="league-event-new".
So the line below gets all events, and the first element looks like this.
event = driver.find_elements_by_css_selector("div.league-event-new")
Test.
>>> event
[<selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="a147ce8a-4e93-4abc-9544-9e71c18d4389")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="2c8db369-a8ff-4a90-a258-7b5ec678e5da")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="48482681-b645-4570-8d64-e1ff9e12a089")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="5c229435-dca3-41e7-8e2b-215e4d0e64a1")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="2213fa4d-c397-4422-a5d1-7ac3932d5d2c")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="8ebb073c-f525-4fe4-a93d-5c487680f302")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="1dafeb95-5553-4bbf-a0e9-f5ae3fa79c9f")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="91e0b62a-685c-4946-9cf6-819b6cce6150")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="6a4c75c8-7ff9-4b90-b50a-2cdf2303454f")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="7ac9212d-e42b-4df8-ad83-b0a5a5dd43e6")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="eae95338-75f4-4d65-9a26-577c486ed46e")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="ee561496-420b-4b4d-a7ff-f2ec9359d189")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="ec6a9491-a801-4cce-8cff-3f216001ccb4")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="bcbe06c3-8672-4be3-a654-197546b2f7a7")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="bc350602-d530-4276-aa97-9cf11e3a2077")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="f4e0306d-34df-4074-a70d-b09b5cf137f9")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="19b5afb6-3206-4e80-beae-149d245cd105")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="313861bb-50fd-4bd1-a5b7-6eeb1ba894d6")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="ed793e19-a5b7-4f29-aa6c-a93075f261ae")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="5762d704-7c0c-4170-9bbd-40a3d897273f")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="9c26cfe8-697d-4da2-876a-baa6c0cbdb8f")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="190d0053-bec2-47f7-9213-5b56bf2cf332")>,
 <selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="571ac12c-d4e1-4b78-82da-45712723f312")>]

>>> print(event[0].text)
Friday
20 Aug
20:00
Football · Netherlands · Netherlands Eredivisie
NEC - ZWOLLE
58 Bookmakers, 3905 odds
1 2.52
X 3.20
2 2.81
O 1.76
U 2.01
O/U 2.5
So it's all there: date, league name, teams, odds, etc.
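Since the whole event arrives as one block of text, the header fields can be picked out by position. This is a small sketch, assuming every event uses the same 12-line layout as the output above (other events may differ, and the helper name `split_event` is made up for illustration):

```python
# Text of one event, copied from the print(event[0].text) output above.
EVENT_TEXT = """Friday
20 Aug
20:00
Football · Netherlands · Netherlands Eredivisie
NEC - ZWOLLE
58 Bookmakers, 3905 odds
1 2.52
X 3.20
2 2.81
O 1.76
U 2.01
O/U 2.5"""


def split_event(text):
    """Split one event's text into named header fields plus raw odds lines.

    Assumes the 12-line layout shown above.
    """
    lines = text.split("\n")
    day, date, time, league, teams, coverage = lines[:6]
    return {
        "date": f"{day} {date} {time}",
        "league": league,
        "teams": teams,
        "coverage": coverage,
        "odds_lines": lines[6:],  # still raw strings such as "1 2.52"
    }


event_info = split_event(EVENT_TEXT)
print(event_info["date"], "|", event_info["teams"])
```

In a live session the input would be `event[0].text` rather than a pasted string.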

You can also copy a CSS selector or XPath from DevTools (right-click the tag and choose Copy) to get the exact value for a tag on the site.
odds = driver.find_elements_by_css_selector("#content > div:nth-child(5) > div:nth-child(4)")
Test.
>>> odds
[<selenium.webdriver.remote.webelement.WebElement (session="4a5e97dcfedde6786f8a1605ae80a197", element="f6008c3c-6c39-4a82-8522-8d8ad49514e3")>]
>>> odds[0].text
'1 2.52\nX 3.20\n2 2.81'
>>> print(odds[0].text)
1 2.52
X 3.20
2 2.81
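The odds text above can be turned into numbers with a few lines. This is a sketch, assuming each line has the form "label price" as in the output above (the helper name `parse_odds` is hypothetical):

```python
def parse_odds(text):
    """Turn 'label price' lines into a {label: float} mapping."""
    odds = {}
    for line in text.splitlines():
        # Split on the last space so labels such as "O/U" stay intact.
        label, price = line.rsplit(" ", 1)
        odds[label] = float(price)
    return odds


odds_1x2 = parse_odds('1 2.52\nX 3.20\n2 2.81')
print(odds_1x2)  # {'1': 2.52, 'X': 3.2, '2': 2.81}
```

Keep in mind that a positional selector copied from DevTools, such as `div:nth-child(5)`, appears to point at one specific block, so it would need a different index for every event.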
#5
Thanks for your replies.

I am a little bit further now, but not yet where I want to be. I will try to get closer to my goal with your help.
#6
Excuse me, but I have another question. I have been trying for at least 10 hours to scrape the data from betmonitor, but every time I think I've got a good Python script, it doesn't work.

With the code below, I am trying to get an Excel file with 5 columns: date, league, match, 1x2 odds (3 separate columns) and O/U odds (2 separate columns).

Can somebody please tell me what is wrong in the code below and why I don't get any data?

Thanks a lot!!!!

[Image: Data.png]


from selenium import webdriver
import time
import pandas as pd

url = 'https://www.betmonitor.com/odds-comparison/football/germany-bundesliga/10000090'
driver = webdriver.Chrome(executable_path='C:/webdrivers/chromedriver.exe')
evt_details = []

driver.get(url)
time.sleep(5)
evt_list = driver.find_elements_by_css_selector('div.league-event-new')

for evt_match in evt_list:
  # Getting match info
  evt_date = evt_match.find_elements_by_xpath('//div[@class="evtime"]')[0]
  evt_league = evt_match.find_elements_by_xpath('//div[@class="league"]')[0]
  evt_teams = evt_match.find_elements_by_xpath('//div[@class="teams"]')[0]
   for x in range(1, 20)
    evt_1x2odds = evt_match.find_elements_by_xpath('//*[@id="content"]/div[{x}]/div[4]"]')[0]
    evt_OUodds = evt_match.find_elements_by_xpath('//*[@id="content"]/div[{x}]/div[5]"]')[0]
# Saving match info
match_info = [evt_date.text, evt_league.text, evt_teams.text, evt_1x2odds.text, evt_OUodds.text]
# Saving into evt details
evt_details.append(match_info)
driver.quit()

evt_details_df = pd.DataFrame(evt_details)
evt_details_df.columns = ['date', 'league', 'teams', 'odds 1x2', 'odds OU2.5']
evt_details_df.to_csv('evt_details.csv', index=False)
#7
So, starting from this line, I would run some tests my own way rather than using your code; don't loop before you have figured out the basics.
event = driver.find_elements_by_css_selector("div.league-event-new")
>>> data = event[0].text
>>> data = data.split('\n')
>>> data
['Friday',
 '20 Aug',
 '20:00',
 'Football · Netherlands · Netherlands Eredivisie',
 'NEC - ZWOLLE',
 '61 Bookmakers, 4329 odds',
 '1 2.50',
 'X 3.22',
 '2 2.78',
 'O 1.76',
 'U 2.00',
 'O/U 2.5']
>>> 
>>> data = list(zip(*[data[i::3] for i in range(3)]))
>>> data
[('Friday', '20 Aug', '20:00'),
 ('Football · Netherlands · Netherlands Eredivisie',
  'NEC - ZWOLLE',
  '61 Bookmakers, 4329 odds'),
 ('1 2.50', 'X 3.22', '2 2.78'),
 ('O 1.76', 'U 2.00', 'O/U 2.5')]
>>> 
>>> df = pd.DataFrame(data) 
>>> df = df.transpose()
>>> df
        0                                                1       2        3
0  Friday  Football · Netherlands · Netherlands Eredivisie  1 2.50   O 1.76
1  20 Aug                                     NEC - ZWOLLE  X 3.22   U 2.00
2   20:00                         61 Bookmakers, 4329 odds  2 2.78  O/U 2.5
>>> 
>>> df.rename(columns={0: "Date", 1: "Event", 2: "Odds_1", 3: "Odds_2"}, inplace=True)
>>> df
     Date                                            Event  Odds_1   Odds_2
0  Friday  Football · Netherlands · Netherlands Eredivisie  1 2.50   O 1.76
1  20 Aug                                     NEC - ZWOLLE  X 3.22   U 2.00
2   20:00                         61 Bookmakers, 4329 odds  2 2.78  O/U 2.5
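The `zip(*[data[i::3] for i in range(3)])` line in the session above is a compact way to cut a flat list into consecutive groups of three. The following sketch shows what it does next to a more explicit slicing version, using the same 12 strings as above:

```python
data = [
    'Friday', '20 Aug', '20:00',
    'Football · Netherlands · Netherlands Eredivisie', 'NEC - ZWOLLE',
    '61 Bookmakers, 4329 odds',
    '1 2.50', 'X 3.22', '2 2.78',
    'O 1.76', 'U 2.00', 'O/U 2.5',
]

# data[0::3] holds items 0, 3, 6, 9; data[1::3] holds items 1, 4, 7, 10; and so on.
# zip(*...) then lines them up again as (0, 1, 2), (3, 4, 5), ...
groups_zip = list(zip(*[data[i::3] for i in range(3)]))

# More explicit equivalent: take consecutive windows of three items.
groups_slice = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]

print(groups_zip == groups_slice)
```

The two versions agree here because the list length is a multiple of three. If it were not, `zip` would silently drop the leftover items, while the slicing version would keep a shorter final tuple.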
So now we have the structure of the first event, with column names added, laid out the same way as shown on the website.
Output:
     Date                                            Event  Odds_1   Odds_2
0  Friday  Football · Netherlands · Netherlands Eredivisie  1 2.50   O 1.76
1  20 Aug                                     NEC - ZWOLLE  X 3.22   U 2.00
2   20:00                         61 Bookmakers, 4329 odds  2 2.78  O/U 2.5
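As for the script in post #6, a few things stand out when reading it: the inner `for` line has no colon and is indented inconsistently, `{x}` sits in a plain string rather than an f-string, both odds XPaths end with a stray `"]`, every XPath starts with `//` (so it searches the whole page instead of the current event), and the row is built and appended after the loop instead of inside it. One possible way around the XPaths altogether is to flatten each event's text into one row per match. This is a sketch only, assuming the 12-line layout shown above and using the standard `csv` module in place of pandas; `event_to_row` is a made-up helper name.

```python
import csv
import io


def event_to_row(text):
    """Flatten one event's text (12-line layout assumed) into one row."""
    lines = text.split("\n")
    row = {
        "date": " ".join(lines[0:3]),
        "league": lines[3],
        "teams": lines[4],
    }
    # Remaining lines look like "1 2.50" or "O/U 2.5": label, space, value.
    for line in lines[6:]:
        label, value = line.rsplit(" ", 1)
        row[label] = value
    return row


# In a live session, texts would come from something like
# [e.text for e in driver.find_elements_by_css_selector("div.league-event-new")].
texts = [
    "Friday\n20 Aug\n20:00\nFootball · Netherlands · Netherlands Eredivisie\n"
    "NEC - ZWOLLE\n61 Bookmakers, 4329 odds\n"
    "1 2.50\nX 3.22\n2 2.78\nO 1.76\nU 2.00\nO/U 2.5",
]
rows = [event_to_row(t) for t in texts]

# StringIO keeps the sketch self-contained; to write a real file, use
# open('evt_details.csv', 'w', newline='') instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

If pandas is preferred, the same list of dicts can be passed to `pd.DataFrame(rows)` and saved with `to_csv`, as in the original script.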

