Thread: I want to scrap a tor site. (/thread-34926.html)
Python Forum (https://python-forum.io), Forum: Python Coding, Networking
I want to scrap a tor site. - Blue Dog - Sep-15-2021

Hi, I have been working on this for over a month. I can get some data from this site with requests, but I can't get any kind of reply with BeautifulSoup.

Here is my code with requests:

    import socket
    import socks
    import urllib2
    import requests
    from bs4 import BeautifulSoup

    ipcheck_url = 'http://checkip.amazonaws.com/'

    # Actual IP.
    print(urllib2.urlopen(ipcheck_url).read())

    # Tor IP.
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9150)
    socket.socket = socks.socksocket
    print(urllib2.urlopen(ipcheck_url).read())

    url = 'http://darkzzx4avcsuofgfez5zq75cqc4mprjvfqywo45dfcaxrwqg6qrlfid.onion/'
    r = requests.get(url)
    print(r.url)
    print r.encoding
    print r.raw
    print r.status_code
    print r.headers
    print r.url

Here is my code with soup:

    import socket
    import socks
    import urllib2
    import requests
    from bs4 import BeautifulSoup

    ipcheck_url = 'http://checkip.amazonaws.com/'

    # Actual IP.
    print(urllib2.urlopen(ipcheck_url).read())

    # Tor IP.
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9150)
    socket.socket = socks.socksocket
    print(urllib2.urlopen(ipcheck_url).read())

    soup = BeautifulSoup(requests.get("http://darkzzx4avcsuofgfez5zq75cqc4mprjvfqywo45dfcaxrwqg6qrlfid.onion/").text, 'lxml')
    print soup

Now, am I doing something dumb, or for some dumb reason does soup not work on onion sites? I hope someone can help with this. I have a lot of work to do on the dark web.

Thank you,
renny

RE: I want to scrap a tor site. - Larz60+ - Sep-15-2021

I'd break up the statement (line 20), then check the response's status_code to see if it is 200 (success). The URL may not be correct.

RE: I want to scrap a tor site. - Blue Dog - Sep-15-2021

I did have a status check in the code, and I did break the line up. I tried it both ways. I have been working on this for a month; there is not much that I have not tried.

RE: I want to scrap a tor site. - Blue Dog - Sep-22-2021

Thanks for all the help, I got it working great. I am in scraper heaven.
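
For anyone reading along, Larz60+'s suggestion (fetch first, check the status code, then parse) might look roughly like the sketch below. The thread never shows the fix Blue Dog ended up with, so this is not that fix. It is a Python 3 sketch that assumes Tor Browser is listening on 127.0.0.1:9150 (the port from the original post) and that requests is installed with SOCKS support (`pip install requests[socks]`) along with beautifulsoup4. The helper names `tor_proxies` and `fetch_and_parse` are made up for illustration. The `socks5h` scheme asks the proxy to resolve the hostname, which matters for .onion names because ordinary DNS cannot resolve them.

```python
def tor_proxies(host="127.0.0.1", port=9150):
    """Build a requests-style proxies mapping for a local Tor SOCKS port.

    Assumption: Tor Browser's default SOCKS port (9150), as in the original post.
    """
    address = f"socks5h://{host}:{port}"  # "h" = let the proxy resolve hostnames
    return {"http": address, "https": address}


def fetch_and_parse(url, timeout=60):
    """Fetch a page through Tor, check the status, then hand the HTML to BeautifulSoup."""
    # Third-party imports are kept local so tor_proxies() works without them installed.
    import requests
    from bs4 import BeautifulSoup

    # Step 1: fetch on its own line, so a failure here is not confused with a parse failure.
    response = requests.get(url, proxies=tor_proxies(), timeout=timeout)

    # Step 2: check the status code before parsing (200 means success).
    print("status:", response.status_code)
    response.raise_for_status()

    # Step 3: parse. "html.parser" ships with Python; "lxml" would also work if installed.
    return BeautifulSoup(response.text, "html.parser")


if __name__ == "__main__":
    # Sanity check against the IP-echo service used in the original post.
    soup = fetch_and_parse("http://checkip.amazonaws.com/")
    print(soup.get_text().strip())
```

Passing `proxies=` per request keeps the proxy setting local to that call, instead of replacing `socket.socket` for the whole process as the original code does.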

RE: I want to scrap a tor site. - Blue Dog - Sep-26-2021

Let me ask you this: does a dark web site have an IP address, or just the name of the site? I am trying to convert a name like 2635/6.onion to some type of IP address like 122.34.127.90. So far I have not been able to do it.

Thank you,
Renny
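
On the question above: as far as the Tor design goes, a .onion name is not a DNS name and has no IP address to convert to. A version 3 onion address is 56 base32 characters that encode the service's public key, a checksum, and a version byte, and the Tor network routes to the service from that key. The sketch below only illustrates this by decoding the address from the first post and reading its version byte. The function name `onion_info` is made up for this example, and it does not verify the checksum.

```python
import base64
import binascii
from urllib.parse import urlsplit


def onion_info(url):
    """Return (label_length, version_byte) for a .onion URL, or None for other hosts.

    version_byte is None when the label is not a decodable 56-character v3 address.
    """
    host = urlsplit(url).hostname or ""
    if not host.endswith(".onion"):
        return None
    # Use the label directly before ".onion"; subdomains in front of it are ignored.
    label = host[: -len(".onion")].split(".")[-1]
    if len(label) != 56:
        return (len(label), None)
    try:
        # 56 base32 characters decode to 35 bytes: 32 key + 2 checksum + 1 version.
        raw = base64.b32decode(label.upper())
    except binascii.Error:
        return (len(label), None)
    return (len(label), raw[-1])


if __name__ == "__main__":
    url = "http://darkzzx4avcsuofgfez5zq75cqc4mprjvfqywo45dfcaxrwqg6qrlfid.onion/"
    print(onion_info(url))                              # (56, 3)
    print(onion_info("http://checkip.amazonaws.com/"))  # None
```

The decoded bytes are key material, not an address, which is why a DNS-style lookup on a .onion name has nothing to return.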