Python Forum
How can I get URLs from JavaScript in Selenium (Python 3)?
#1
I am writing a parser for https://www.oddsportal.com

See this URL: https://www.oddsportal.com/soccer/englan...d-nNNqedbR

I ran into the following problem: I need to get the URLs from this block:
[Image: RBcgO.png]

How can I get all the absolute URLs from this menu?
If writing out all the URLs takes too long, just the URL for "Home/Away": "2nd Half" would do, for example.

I think these URLs are generated by JS (and maybe Ajax), and I don't know how to iterate over them.
[Image: 3TZ6P.png]
[Image: z5jOF.png]

def main(url):
    options = webdriver.ChromeOptions()
    options.add_argument('headless')
    driver = webdriver.Chrome(chrome_options=options)
    driver.get(url)

def get_url():
    base_url = 'https://www.oddsportal.com/soccer/england/premier-league/wolves-newcastle-utd-nNNqedbR'
    for i in ???:
        first_part = ???
        second_part = ???
        url = base_url + '#' + first_part + ';' + second_part
        main(url)
#2
Your function setup is wrong; just drop the functions for now if you are unsure how they work.
It's a really messy site to deal with, so not the easiest one to start with if you are new to this.

Here is one way to get the values from the first row; it can also be easier to send browser.page_source to BeautifulSoup for parsing.
Turn off headless mode while testing.
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
import time

#--| Setup
chrome_options = Options()
#chrome_options.add_argument("--headless")
#chrome_options.add_argument('--disable-gpu')
#chrome_options.add_argument('--log-level=3')
browser = webdriver.Chrome(executable_path=r'C:\cmder\bin\chromedriver.exe')
#--| Parse or automation
browser.get('https://www.oddsportal.com/soccer/england/premier-league/wolves-newcastle-utd-nNNqedbR#1X2;4')
# Give the JavaScript time to render the odds table before reading the source
time.sleep(3)
# Give source code to BeautifulSoup
soup = BeautifulSoup(browser.page_source, 'lxml')
table_first_line = soup.select('#odds-data-table > div > table > tbody > tr:nth-of-type(1)')
print(table_first_line[0].text.strip())
browser.quit()
This gets all the values, but the output needs some cleanup (whitespace).
Output:
bet-at-home  2.052.404.8490.0%
Look at Web-scraping part-2.
#3
(Feb-03-2019, 07:33 PM)snippsat Wrote: Your function setup is wrong; just drop the functions for now if you are unsure how they work.

Thanks for the answer, but I need to get a list of URLs. I already know how to get the text from the cells...

And I need a function so that I can run it for each URL in a for loop.
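
A minimal sketch of such a function, assuming every tab URL follows the base_url + '#' + first_part + ';' + second_part pattern from the first post. The names build_tab_url and iter_tab_urls are hypothetical, and the only fragment values known from this thread are '1X2' and '4'; the values for the other menu entries would have to be read from the site itself.

```python
from itertools import product

BASE_URL = ("https://www.oddsportal.com/soccer/england/premier-league/"
            "wolves-newcastle-utd-nNNqedbR")


def build_tab_url(base_url, first_part, second_part):
    """Join the base URL and the two fragment parts as '#first;second'."""
    return f"{base_url}#{first_part};{second_part}"


def iter_tab_urls(base_url, first_parts, second_parts):
    """Yield one URL for every (first_part, second_part) combination."""
    for first_part, second_part in product(first_parts, second_parts):
        yield build_tab_url(base_url, first_part, second_part)


# '1X2' and '4' are the only values that appear in this thread;
# extend the two lists once the remaining values are known.
urls = list(iter_tab_urls(BASE_URL, ["1X2"], ["4"]))
for url in urls:
    print(url)
```

Each URL produced this way could then be passed to driver.get() inside the loop. Whether every combination of the two parts is a valid tab on the site is an assumption that needs checking.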
#4
(Feb-03-2019, 07:33 PM)snippsat Wrote: Your function setup is wrong; just drop the functions for now if you are unsure how they work.
It's a really messy site to deal with, so not the easiest one to start with if you are new to this.

Here is one way to get the values from the first row; it can also be easier to send browser.page_source to BeautifulSoup for parsing.

Hi. I wrote the code using pure lxml, and it runs faster than yours.

Yes, it's a really messy site to deal with, so not the easiest one to start with if you are new to this. It has many pitfalls, but after it other sites will be easy.

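import time
import lxml.html  # cssselect() also needs the cssselect package

# Wait for the JavaScript to render the table before reading the source
time.sleep(3)
page = browser.page_source

doc = lxml.html.fromstring(page)
# First odds row in the table
row = doc.cssselect("tr.lo")[0]
print(row.text_content().strip())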