Installing and running a python web scraping app from github to a windows 8.1 system
The easy way to see if you can use pip is to go to https://pypi.org/ and search for the package by name.
If the package is there, you can use pip.
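The same check can be scripted. This is only a rough sketch: it assumes PyPI's JSON endpoint at https://pypi.org/pypi/<name>/json, which answers with HTTP 404 for unknown names. The function names pypi_url and is_on_pypi are made up for this example.

```python
import urllib.error
import urllib.request


def pypi_url(package_name):
    """Build the PyPI JSON API URL for a package name."""
    return "https://pypi.org/pypi/{}/json".format(package_name.strip())


def is_on_pypi(package_name, opener=urllib.request.urlopen):
    """Return True if PyPI answers for the package, False on HTTP 404."""
    try:
        with opener(pypi_url(package_name)) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other HTTP error is left for the caller to see
```

The opener argument is only there so the function can be tried without a network connection; in normal use you would just call is_on_pypi("some-package").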
Sep-25-2018, 08:29 PM
There's a requirements.txt file: https://github.com/tomslee/airbnb-data-c...ements.txt
This is a file which can be generated by pip (the Python package manager): https://pip.pypa.io/en/stable/reference/pip_freeze/
Installing all the dependencies is normally automatic (through a setup.py file, as previously mentioned), but since this author chose not to do that, you can still do it yourself fairly easily with:
    pip install -r requirements.txt
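One thing to be aware of: pip install -r gives up on the whole file if a single requirement cannot be satisfied. A possible workaround, sketched below, is to try each line on its own and collect the ones that fail. It assumes the file contains only plain requirement lines (no -r or -e option lines), and read_requirements / install_each are names invented for this sketch.

```python
import subprocess
import sys


def read_requirements(text):
    """Return requirement lines, skipping blanks and comments."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def install_each(requirements, run=subprocess.call):
    """Try to install each requirement on its own; return the ones that failed."""
    failed = []
    for req in requirements:
        # sys.executable -m pip targets the same interpreter that runs this script
        exit_code = run([sys.executable, "-m", "pip", "install", req])
        if exit_code != 0:
            failed.append(req)
    return failed


def main(path="requirements.txt"):
    """Install everything that can be installed and report the rest."""
    with open(path, encoding="utf-8") as handle:
        requirements = read_requirements(handle.read())
    for req in install_each(requirements):
        print("could not install:", req)
```

Calling main() from the folder that holds requirements.txt prints the entries pip could not install, which you can then look for elsewhere (conda, for example).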
Quote: you can still do it yourself fairly easily, with: pip install -r requirements.txt
Hi, thanks very much for this, I didn't know this trick. However, the process stops when it is not able to find a suitable package version to install. It happens with anaconda, so I was installing all the libs manually. The weird thing is that when I arrived at nb-anacondacloud in the requirements.txt list, I realized I need to install that with conda. I installed miniconda3 in order to run the command:
    conda install -c conda-forge nb_anacondacloud
But I'm behind a proxy and I don't know how to pass proxy settings to conda commands. I read about a .condarc file but I didn't get over the hump. Then I tried using another internet connection (proxy free) and finally I installed the nb-conda requirements. What puzzles me is that (I hope I remember the exact message that appeared) the installation process downgraded (what does that mean?) some of the previously installed packages and in some cases changed a package version. This whole process is starting to remind me of Penelope's web.
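On the proxy question, one approach that may work is to set the HTTP_PROXY and HTTPS_PROXY environment variables for the conda process; the .condarc route uses a proxy_servers entry with http and https keys for the same purpose. The sketch below takes the environment-variable route. It assumes conda is on the PATH as an executable, and the proxy address shown in the usage note is a placeholder, not a real server.

```python
import os
import subprocess


def proxy_env(proxy_url, base_env=None):
    """Copy the environment and add proxy variables for child processes."""
    env = dict(os.environ if base_env is None else base_env)
    env["HTTP_PROXY"] = proxy_url
    env["HTTPS_PROXY"] = proxy_url
    return env


def conda_install(package, channel, proxy_url, run=subprocess.call):
    """Run `conda install -y -c <channel> <package>` with the proxy set."""
    command = ["conda", "install", "-y", "-c", channel, package]
    return run(command, env=proxy_env(proxy_url))
```

For example, conda_install("nb_anacondacloud", "conda-forge", "http://proxy.example.com:8080") would run the command from the post with the (hypothetical) proxy applied only to that one call.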
Hi to all ,
finally I managed to install all the libraries for this GitHub project, and when I type airbnb I get all the command options:
    usage: airbnb.py [options]
...but I still have to configure the PostgreSQL database. All I have done so far is install PostgreSQL version 10 on my system. Tom Slee says in his help file:
Quote: Installing and upgrading the database schema
I'm not able to find the schema_current.sql file, and of course when I check the database connection with:
    python airbnb.py -dbp
I get this:
Could someone help me configure the psql database?
p.s. Here is my airbnb config file if someone wants to check it: https://drive.google.com/open?id=1_psMrQ...r8_kAvobyh
Quote: Could someone help me in configuring psql database?
In particular, in the GitHub folder there is a postgresql folder. Do I have to copy these files into my PostgreSQL installation, and if so, where? And where do I have to run this command?
    psql --user airbnb airbnb < postgresql/schema_current.sql
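For what it's worth, the relative path in that psql command suggests it is meant to be run from the top of the cloned repository, with psql on the PATH, rather than after copying files into the PostgreSQL installation. Here is a sketch of the same call from Python, using psql's --username, --dbname and --file options in place of the shell redirect. The user and database names are taken from the command above; schema_command and load_schema are invented names.

```python
import subprocess


def schema_command(sql_path, user="airbnb", database="airbnb"):
    """Build the psql call that executes a SQL file against a database."""
    return ["psql", "--username", user, "--dbname", database, "--file", sql_path]


def load_schema(sql_path, run=subprocess.call):
    """Run psql on the schema file; returns psql's exit code."""
    return run(schema_command(sql_path))
```

load_schema("postgresql/schema_current.sql") would only work if that file actually exists in your checkout and the airbnb role and database have already been created.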
airbnb is not an executable program. It is a 'package' that you 'import' into your code, and you write code around it to 'call' methods from the package library (or in this case an API).
As I pointed out to you already, go to the GitHub page https://github.com/nderkach/airbnb-python and scroll down to the section named 'Usage'. This will show you how to use the package!
Oct-04-2018, 06:29 AM
Hi to all,
I managed to run the Python web scraping package, even the PostgreSQL part, but when I run it I get this msg:
Even after other attempts, and behind a VPN, I get the same msg. Is there something I can do, perhaps editing the content of the following file and the user agent list? https://drive.google.com/open?id=1jHmu8k...mBMqxfYOfv
Pls don't be shy to help a little, Larz60+
William bowed. “You are wise also when you are severe. It shall be as you wish.” “If ever I were wise, it would be because I know how to be severe,” the abbot answered. ...from Umberto Eco, The Name of the Rose.
Oct-04-2018, 11:03 AM
Some sites require you to provide a header (user-agent) in your request, and may refuse your connection otherwise.
You can find your user-agent by googling 'my user agent'; it will appear at the top of the page. Then set it up in a dictionary like:
    user_agent = {
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:60.0) '
                      'Gecko/20100101 Firefox/60.0 AppleWebKit/537.36 (KHTML, '
                      'like Gecko) Chrome/68.0.3440.75 Safari/537.36'}
I don't know what you are using to get the page, but with requests it would be:
    import requests
    response = requests.get(url, headers=user_agent)
Also make sure you are sleeping for a few seconds after each request, as hammering on a site is another reason for refusal.
    from time import sleep
    ...
    response = requests.get(url, headers=user_agent)
    sleep(2)
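Putting the header and the pause together, a loop over several pages might look like the sketch below. It is not how the airbnb script does it. The get argument stands in for requests.get (or any callable with the same url/headers arguments), the User-Agent string is just an example value, and fetch_all is an invented name.

```python
import time

# Example value only; replace it with the user-agent your own browser reports.
USER_AGENT = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) "
                  "Gecko/20100101 Firefox/60.0"
}


def fetch_all(urls, get, delay=2.0, sleep=time.sleep):
    """Fetch each URL with a User-Agent header, pausing between requests."""
    responses = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # pause before every request except the first
        responses.append(get(url, headers=USER_AGENT))
    return responses
```

With requests installed, fetch_all(list_of_urls, requests.get, delay=5) would send the header on every request and wait five seconds between them.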
Hi, and thanks for your answer. I checked both the user agent and sleep settings and I think they are OK (sleep time set to 5 sec). When I run the package I get a "WARNING No proxy_list in airbnb.config: not using proxies" (I'm not behind a corporate proxy when I test the package).
Tom Slee says this about proxies in the help file:
Quote: Sometimes the Airbnb site refuses repeated requests. I run the script using a number of proxy IP addresses to avoid being turned away, and that costs money. I am afraid that I cannot help in finding or working with proxy IP services.
I have a VPN account and I tried to run the package behind three or four different IP addresses (one at a time) with the same result. One thing I would like to understand is this: when someone buys a proxy service to run a crawler (I've seen a bunch of offers on the internet), is that a different service from the standard VPN that anonymizes browsing? Is there anything else I can try apart from buying a proxy service? In any case, the next thing I'll do is try to run the package from another PC and internet connection.
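On the difference: a VPN sends all of your traffic out through one address at a time, while a proxy list lets a script pick a different address for each request. With requests that is done through the proxies argument of requests.get. Below is a rough sketch of rotating through such a list. The addresses in the usage note are placeholders, proxy_rotation is an invented name, and this is not necessarily how the airbnb script reads its proxy_list setting.

```python
from itertools import cycle


def proxy_rotation(proxy_urls):
    """Yield requests-style proxies dicts, cycling through the list forever."""
    for proxy in cycle(proxy_urls):
        # the same proxy is used for both plain and TLS requests
        yield {"http": proxy, "https": proxy}
```

Usage would be along the lines of rotation = proxy_rotation(["http://10.0.0.1:8080", "http://10.0.0.2:8080"]) followed by requests.get(url, headers=user_agent, proxies=next(rotation)) for each request.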