Python Forum
error installing requests_html
#1
Hi, I'm trying to run "pip install requests_html" and I get this error:
Executed command: pip install libxml2-python3
Error occurred: Non-zero exit code (1)

I'm running Python 3.8 on Windows 10 and want to use BeautifulSoup for dynamic JavaScript sites.
Any ideas how I can solve this? Thanks for any help.
#2
I tried installing requests_html on Python 3.8 on Windows 10 and got no error on install (Install tutorial).
When I tested it earlier it was not stable, and it now also lacks updates.
Users in the issue tracker say:
Quote: So I'm guessing that this project is abandoned.

Using Selenium with a web driver (Chrome or Firefox) is a stable solution that works for most cases.
There is an example in your previous thread.
Yahoo Finance is not an easy site to parse; here is how it looks with the tools mentioned above.
# Written against Selenium 3. Selenium 4 replaced find_elements_by_xpath
# with find_elements(By.XPATH, ...) and dropped the executable_path argument.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import time

#--| Setup
options = Options()
# Uncomment the next three lines to run Chrome without a visible window
#options.add_argument("--headless")
#options.add_argument("--window-size=1980,1020")
#options.add_argument('--disable-gpu')
browser = webdriver.Chrome(executable_path=r'chromedriver.exe', options=options)
#--| Parse or automation
url = 'https://finance.yahoo.com/quote/BARC.L/key-statistics?p=BARC.L'
browser.get(url)
# Hand the rendered page to BeautifulSoup (not used further in this example)
soup = BeautifulSoup(browser.page_source, 'lxml')
# Click the accept button on the consent page
accept = browser.find_elements_by_xpath('//*[@id="consent-page"]/div/div/div/div[3]/div/form/button[1]')
accept[0].click()
# Give the quote page time to load after the click
time.sleep(2)
main_value = browser.find_elements_by_xpath('//*[@id="quote-header-info"]/div[3]/div/div/span[1]')
print(main_value[0].text)
Output:
138.54
Uncomment the --headless line and the browser window will not be shown.
If you are new to this, let the browser load so you can see what happens, such as the button being clicked.
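The fixed time.sleep(2) in the example works, but it either waits too long or not long enough depending on how fast the page loads. A minimal sketch of the alternative, polling until a condition holds, is shown here. The helper name wait_until and its parameters are made up for illustration; Selenium ships its own WebDriverWait for this purpose, which is usually the better choice in real code.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.25):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

With the browser object from the example, something like wait_until(lambda: browser.find_elements_by_xpath(xpath)) would return the list of matching elements as soon as it is non-empty, where xpath stands for whichever expression you are waiting on.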
#3
Thanks Snippsat,
super helpful.
I did succeed in installing requests_html using
pip3 install requests_html
from the command prompt
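The thread does not say why the command prompt worked where the first attempt failed. One possibility, and it is only an assumption, is that the two installs targeted different Python environments. A small sketch of how to build a pip command that is tied to the interpreter running the script; the function name pip_install_command is made up for illustration.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    # sys.executable is the full path of the current Python, so
    # "-m pip" installs into that interpreter's environment.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print(" ".join(pip_install_command("requests_html")))
```

To actually run it, the list can be passed to subprocess.check_call. Typing python -m pip install requests_html at the prompt has the same effect for whichever python is first on the PATH.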
#4
Hi Snippsat, I tried running your code and got this error:
soup = BeautifulSoup(browser.page_source, 'lxml')
File "C:\Users\david\PycharmProjects\yahoo-finance\venv\lib\site-packages\bs4\__init__.py", line 225, in __init__
raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
I tried installing the lxml package, but got this error:
ERROR: b"'xslt-config' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n"
Any ideas?
Thanks
#5
(Mar-06-2020, 12:01 AM)davidm Wrote: Tried installing lxml package, but get error:
Use Gohlke's lxml wheel if pip install lxml fails.
Example:
pip install lxml-4.5.0-cp38-cp38-win_amd64.whl
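The cp38 and win_amd64 parts of the wheel name have to match the Python that will install it. A rough sketch for checking what your interpreter reports; the function name interpreter_tags is made up here, and the platform string is only an approximation of the tag used in wheel names (Linux wheels, for instance, usually carry manylinux tags instead).

```python
import sys
import sysconfig


def interpreter_tags():
    """Return (python_tag, platform_tag) roughly as they appear in wheel names."""
    # "cp" stands for CPython; other implementations use other prefixes.
    prefix = "cp" if sys.implementation.name == "cpython" else sys.implementation.name
    python_tag = f"{prefix}{sys.version_info.major}{sys.version_info.minor}"
    # sysconfig reports e.g. "win-amd64"; wheel names use underscores.
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag


if __name__ == "__main__":
    print(interpreter_tags())
```

On a 64-bit Python 3.8 on Windows this should report cp38 and win_amd64, which is the combination in the file name above. A 32-bit install would need the win32 wheel instead.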
You can also use another parser; the built-in one would be:
soup = BeautifulSoup(browser.page_source, 'html.parser')
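If the same script has to run on machines with and without lxml, the parser name can be picked at run time. A minimal sketch, assuming the BeautifulSoup feature name and the importable module name are the same (true for lxml); choose_parser is a hypothetical helper, not part of bs4.

```python
import importlib.util


def choose_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if a module of that name is importable, else `fallback`."""
    # find_spec looks the module up without importing it.
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback
```

It would be used as BeautifulSoup(browser.page_source, choose_parser()), which avoids the FeatureNotFound error when lxml is missing.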