Python Forum
scraping video src fail
#6
(Jul-11-2021, 02:32 AM)jacklee26 Wrote: What if the HTML page has double HTML like this? I tried to get it, but the result is empty.
Adding /source will break the XPath.
There is one tag that makes this task different, which is iframe.
Also, a common mistake is not giving the page time to load. I use time.sleep as a first test; Selenium also has Waits that deal with this properly.
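
To show what a wait does differently from a fixed sleep, here is a minimal sketch using only the standard library. The helper wait_until and the fake element_is_present check are made-up names for illustration and are not part of Selenium. In real code you would more likely use Selenium's own WebDriverWait(browser, 10).until(...), which polls in a similar way.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns something truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # Stop as soon as the condition is met
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # Short pause between checks instead of one long sleep


# Simulated "page": the element only shows up on the third check.
attempts = {"count": 0}


def element_is_present():
    attempts["count"] += 1
    return "video element" if attempts["count"] >= 3 else None


found = wait_until(element_is_present, timeout=5, poll=0.1)
print(found, attempts["count"])
```

Unlike sleep(3), this returns as soon as the condition holds and raises an error if it never does, so a slow page is not silently parsed too early.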

So, to test the code given: on a real page you may need to switch into the iframe first with browser.switch_to.frame(iframe).
I get the source text from the iframe, then parse that text (it is now just text, not HTML) with BS to get the tag wanted.
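from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep
from bs4 import BeautifulSoup

#--| Setup
options = Options()
#options.add_argument("--headless")
# Selenium 4: pass the driver path via Service (executable_path= was deprecated, later removed)
service = Service(r'C:\cmder\bin\chromedriver.exe')
browser = webdriver.Chrome(service=service, options=options)
#--| Parse or automation
browser.get('file:///E:/div_code/scrape/local4.html')
sleep(3)  # Crude fixed wait so the page has time to load
# find_elements_by_xpath() is gone in newer Selenium 4, use find_elements(By.XPATH, ...)
video_tag = browser.find_elements(By.XPATH, '//*[@id="allmyplayer"]')
print(video_tag)
# Send the element's text (HTML as plain text) to BS for parsing
soup = BeautifulSoup(video_tag[0].text, 'lxml')
print(soup.find('source').get('src', 'Not Found'))
browser.quit()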
Output:
[<selenium.webdriver.remote.webelement.WebElement (session="d0f2629448eb9fb9baabc6dc77342fb9", element="7cc91c06-2934-4243-9645-a334805ce2c4")>]
https://vs02.520call.me/files/mp4/1/13cDq.m3u8?t=1625961526
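
The parsing step can also be tried without a browser. Below is a rough sketch that pulls the src of the first source tag out of a string of HTML text. It uses the standard library's html.parser in place of BeautifulSoup so it runs with no extra installs. The class SourceSrcParser, the function first_source_src, and the sample markup with its example.com URL are all made up for this illustration.

```python
from html.parser import HTMLParser


class SourceSrcParser(HTMLParser):
    """Collect the src attribute of every <source> tag seen."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser
        if tag == "source":
            self.sources.append(dict(attrs).get("src", "Not Found"))


def first_source_src(html_text):
    """Return the src of the first <source> tag, or 'Not Found'."""
    parser = SourceSrcParser()
    parser.feed(html_text)
    parser.close()
    return parser.sources[0] if parser.sources else "Not Found"


# Hypothetical markup standing in for the text taken from the element
sample = """
<html><body>
<video id="player" controls>
  <source src="https://example.com/stream.m3u8" type="application/x-mpegURL">
</video>
</body></html>
"""
print(first_source_src(sample))
```

With BeautifulSoup, the same lookup is the soup.find('source').get('src', 'Not Found') call from the code above.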


Messages In This Thread
scraping video src fail - by jacklee26 - Jul-10-2021, 01:29 PM
RE: scraping video src fail - by snippsat - Jul-10-2021, 02:16 PM
RE: scraping video src fail - by jacklee26 - Jul-10-2021, 07:17 PM
RE: scraping video src fail - by snippsat - Jul-10-2021, 08:17 PM
RE: scraping video src fail - by jacklee26 - Jul-11-2021, 02:32 AM
RE: scraping video src fail - by snippsat - Jul-11-2021, 09:38 AM
