Python Forum
web crawler that retrieves data not stored in source code
#4
When you inspect an element, you are looking at the source through the browser's eyes, after javascript has run. If the element is not there when you fetch the page with Python, it means it is added by javascript. You would have to get the source with selenium first before handing it off to BeautifulSoup.

For example:
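from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time

url = 'http://www.publi24.ro/anunturi/locuri-de-munca/anunt/Echipa-Tehnician-Alpinist-Telecom/7b00667478616b51.html'

def setup():
    '''
    Set up the webdriver and create the browser.
    '''
    #https://chromedriver.storage.googleapis.com/index.html
    #https://chromedriver.storage.googleapis.com/index.html?path=2.25/ ##latest when this was posted
    chromedriver = "/home/metulburr/chromedriver" #the path to the chromedriver
    # Selenium 4 takes the driver path through a Service object;
    # on Selenium 3 and older this was webdriver.Chrome(chromedriver).
    browser = webdriver.Chrome(service=Service(chromedriver))
    return browser

browser = setup()
try:
    browser.get(url)
    time.sleep(2) #crude pause so the page's javascript can fill in the value

    #page_source is the page as the browser currently sees it
    soup = BeautifulSoup(browser.page_source, 'lxml')
    tag = soup.find('span', {'add-view':'18230886'})
    print(tag.text if tag else 'span not found')
finally:
    browser.quit() #always close the browser, even if parsing fails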
Output:
$ python test.py
16
Note that this will pop a browser window up for a couple of seconds. If you want, you can use a headless browser to keep it in the background.
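As a rough sketch of the headless idea, the same lookup can be run with Chrome's `--headless` switch so no window appears. The helper names (`build_selector`, `build_options`, `fetch_span_text`), the window size, and the 10 second timeout are made up for illustration and are not from the post above. It also swaps the fixed `time.sleep(2)` for Selenium's `WebDriverWait`, and assumes Selenium 4 with a chromedriver that Selenium can locate on its own (on PATH or through Selenium Manager).

```python
# Standard Chrome command-line switches; the window size is an arbitrary choice.
HEADLESS_ARGS = ["--headless", "--window-size=1280,800"]


def build_selector(add_view_id):
    """CSS selector for the span carrying the given add-view attribute."""
    return f'span[add-view="{add_view_id}"]'


def build_options():
    """ChromeOptions that keep the browser in the background."""
    # Imported here so the pure helper above works without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)
    return options


def fetch_span_text(url, add_view_id, timeout=10):
    """Load url headlessly and return the span's text, or None if it is absent."""
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    selector = build_selector(add_view_id)
    browser = webdriver.Chrome(options=build_options())
    try:
        browser.get(url)
        # Poll until the span exists and has non-empty text; a missing element
        # is retried, and TimeoutException is raised after `timeout` seconds.
        WebDriverWait(browser, timeout).until(
            lambda d: d.find_element(By.CSS_SELECTOR, selector).text.strip()
        )
        # Hand the rendered source to BeautifulSoup, as in the example above.
        soup = BeautifulSoup(browser.page_source, "lxml")
        tag = soup.find("span", {"add-view": add_view_id})
        return tag.text.strip() if tag else None
    finally:
        browser.quit()  # always release the browser process
```

Calling `print(fetch_span_text(url, '18230886'))` with the `url` from the example should then do the same job without a visible window, provided Chrome and a matching driver are available on the machine.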


Messages In This Thread
RE: web crawler that retrieves data not stored in source code - by metulburr - Jan-05-2017, 02:53 AM
