Jan-30-2018, 01:39 PM
(Jan-29-2018, 07:21 AM)metulburr Wrote:
(Jan-29-2018, 07:05 AM)sumandas89 Wrote: can I enter into webpages using selenium and then scrape data from there using beautifulsoup?
Yes, if you need to get past javascript, you can use selenium to get the full page content and pass it to BS.
    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get(WEBSITE)  # WEBSITE is a placeholder for the target URL
    # delay of some kind to wait for the page to load
    time.sleep(3)  # or use a selenium wait for an element to be visible
    soup = BeautifulSoup(driver.page_source, 'html.parser')

However, selenium has its own methods for navigating the HTML, and you will need them to get past multiple javascript pages/mouse clicks. So it really depends on whether you still need BS after already using selenium.
(Jan-29-2018, 07:05 AM)sumandas89 Wrote: I observed that beautifulsoup never works on pages that need a login, so I need to log in to that website first using selenium
You can log in to a website with the requests module by saving cookies, etc. Selenium is not required to log in to a website unless the login depends on javascript.
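For reference, the requests-based login suggested in the quote could look roughly like the sketch below. It relies on `requests.Session`, which stores cookies set by one response and sends them back on later requests. The URLs, the form field names, and the helper names `make_session` and `fetch_after_login` are placeholders I made up for illustration; a real site will have its own login endpoint and fields, and may also require a hidden token from the login form.

```python
# Sketch only: both URLs are hypothetical placeholders, not real endpoints.
LOGIN_URL = "https://example.com/login"        # assumed login form target
PROTECTED_URL = "https://example.com/account"  # assumed page behind the login


def make_session():
    """Create a requests.Session, which keeps cookies between requests."""
    import requests  # third-party package: pip install requests
    return requests.Session()


def fetch_after_login(session, login_url, page_url, form_data):
    """Post the login form, then fetch a page using the same session.

    The session remembers cookies set by the login response and sends
    them with the next request, so no manual cookie handling is needed.
    """
    login_response = session.post(login_url, data=form_data)
    login_response.raise_for_status()  # raise on 4xx/5xx status codes

    page_response = session.get(page_url)
    page_response.raise_for_status()
    return page_response.text
```

Calling `fetch_after_login(make_session(), LOGIN_URL, PROTECTED_URL, {"username": "...", "password": "..."})` returns the HTML as a string, which can then be passed to BeautifulSoup in the same way as `driver.page_source`.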
I have seen that this solution sometimes doesn't work. Some content is visible in the rendered web page but is not present in the page source. I have seen this behaviour particularly with facebook and found no solution for it.