Python Forum
Selenium + Aliexpress - Printable Version


Selenium + Aliexpress - alexkorn - Jun-09-2020

Python 3.8
I need to parse a product listing page from Aliexpress. The page uses endless scrolling, so, if I understand correctly, BeautifulSoup on its own will not work. I use Selenium, but it loads the page with product descriptions in English and prices in dollars, and the list of products is not the same as on the Russian version of Aliexpress. The interface is in Russian.
How can I make Selenium load the product list with Russian descriptions and prices in rubles?

....
URL = 'https://flashdeals.aliexpress.com/ru.htm'
browser = webdriver.Chrome()
browser.get(URL)
html = browser.page_source
print(html)
....



RE: Selenium + Aliexpress - Knight18 - Jun-09-2020

I'd like to point out that there is another web scraping library, Scrapy. It is a more "advanced" scraper than Beautiful Soup, although it is harder to learn. It might be a better idea to use it.


RE: Selenium + Aliexpress - alexkorn - Jun-09-2020

My project is very small: I only need to get 10 products from the page. I think Scrapy would be like shooting a sparrow with a tank.


RE: Selenium + Aliexpress - alexkorn - Jun-09-2020

(Jun-09-2020, 07:17 AM)Knight18 Wrote: I'd like to point out that there is another web scraping library, Scrapy. It is a more "advanced" scraper than Beautiful Soup, although it is harder to learn. It might be a better idea to use it.

I am trying to install Scrapy.

ERROR: Command errored out with exit status 1: 'F:\Python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\ccc\\AppData\\Local\\Temp\\pip-install-ngotb8u1\\Twisted\\setup.py'"'"'; __file__='"'"'C:\\Users\\ccc\\AppData\\Local\\Temp\\pip-install-ngotb8u1\\Twisted\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\ccc\AppData\Local\Temp\pip-record-rgt32hb0\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\ccc\AppData\Roaming\Python\Python38\Include\Twisted' Check the logs for full command output.

How can I fix it?


RE: Selenium + Aliexpress - snippsat - Jun-09-2020

(Jun-09-2020, 04:48 PM)alexkorn Wrote: I am trying to install Scrapy.
Use the Twisted wheel from Gohlke.
pip install Twisted-20.3.0-cp38-cp38-win_amd64.whl
pip install scrapy
(Jun-09-2020, 07:49 AM)alexkorn Wrote: My project is very small: I only need to get 10 products from the page. I think Scrapy would be like shooting a sparrow with a tank.
Yes, it can complicate things if you don't need that much.
Also, Scrapy does not handle JavaScript behavior such as scrolling by default; you may need to use scrapy-splash or integrate Selenium.
If you use Selenium alone, the scrolling can be done with this command:
browser.execute_script("window.scrollTo(0, 100000);")
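On a page with endless scrolling, a single jump to a fixed offset may not load everything you need. A minimal sketch of a scroll loop follows, assuming the page grows `document.body.scrollHeight` as new items load; the helper name `scroll_to_bottom` and the `pause` and `max_rounds` parameters are made up for illustration and are not part of Selenium.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until the page height stops growing.

    `driver` is any object with an execute_script(script) method,
    for example a Selenium WebDriver. Returns the number of scrolls
    performed. `pause` and `max_rounds` are illustrative defaults,
    not values taken from the thread.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom of the page.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        # Give the page time to fetch and render more items.
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was loaded, so assume the end was reached.
            break
        last_height = new_height
    return rounds
```

With the code from the first post, this would be called as scroll_to_bottom(browser) after browser.get(URL) and before reading browser.page_source. Since only about 10 products are needed, a small max_rounds is probably enough.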



RE: Selenium + Aliexpress - alexkorn - Jun-09-2020

snippsat Wrote: Use Twisted wheel from Gohlke.
pip install Twisted-20.3.0-cp38-cp38-win_amd64.whl
pip install scrapy

pip install Twisted-20.3.0-cp38-cp38-win_amd64.whl
WARNING: Requirement 'Twisted-20.3.0-cp38-cp38-win_amd64.whl' looks like a filename, but the file does not exist
ERROR: Twisted-20.3.0-cp38-cp38-win_amd64.whl is not a supported wheel on this platform.



RE: Selenium + Aliexpress - snippsat - Jun-09-2020

Then you have a 32-bit version of Python.
pip install Twisted-20.3.0-cp38-cp38-win32.whl
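To check which build of Python is installed before picking a wheel, here is a minimal sketch using only the standard library. The helper name `python_bitness` is made up for illustration, and the mapping to a wheel tag assumes an x86 Windows install like the one in this thread.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


bits = python_bitness()
print(f"Python {sys.version_info.major}.{sys.version_info.minor}, {bits}-bit")

# Assumption: x86 Windows, where wheels are tagged win32 or win_amd64.
print("Matching Windows wheel tag:", "win_amd64" if bits == 64 else "win32")
```

Run it with the same interpreter that pip uses (here F:\Python38\python.exe), since several Python installs on one machine can differ in bitness.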



RE: Selenium + Aliexpress - alexkorn - Jun-09-2020

(Jun-09-2020, 06:50 PM)snippsat Wrote: Then you have a 32-bit version of Python.
pip install Twisted-20.3.0-cp38-cp38-win32.whl

Thanks!