Web Scraping in Python - Printable Version
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html), Thread: Web Scraping in Python (/thread-34695.html)
Web Scraping in Python - phochka - Aug-22-2021

Hi, I am a new user and need some assistance with web scraping. The URL is https://www.geny.com/reunions-courses-pmu?date=2021-08-20. I am posting my test. The response from my test is 'None'. What is wrong? How can I get this response?

<a href="/reunions-courses-pmu/_d2021-08-20?#reunion2">Duindigt (Pays-Bas)</a>
<a href="/reunions-courses-pmu/_d2021-08-20?#reunion3">Fairview (Afrique du Sud)</a>
<a href="/reunions-courses-pmu/_d2021-08-20?#reunion4">La Teste-de-Buch</a>
<a href="/reunions-courses-pmu/_d2021-08-20?#reunion5">Clairefontaine-Deauville</a>
<a href="/reunions-courses-pmu/_d2021-08-20?#reunion6">Cagnes-sur-Mer</a>
<a href="/reunions-courses-pmu/_d2021-08-20?#reunion7">York (Grande-Bretagne)</a>
<a href="/reunions-courses-pmu/_d2021-08-20?#reunion8">Divonne-les-Bains</a>

RE: Web Scraping in Python - snippsat - Aug-22-2021

Use code tags when you post code. You have to find all the <a> tags first.

import requests
from bs4 import BeautifulSoup

url = 'https://www.geny.com/reunions-courses-pmu?date=2021-08-20'
# Browser-like headers sent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0',
    'Accept': 'text/html,*/*',
    'Accept-Language': 'en,en-US;q=0.7,en;q=0.3',
    'X-Requested-With': 'XMLHttpRequest',
    'Connection': 'keep-alive'}
resp = requests.get(url, headers=headers)
soup = BeautifulSoup(resp.text, 'lxml')

# using find: this returns the <div> that wraps the links, not a link
a = soup.find('div', {'class': 'yui-u liensReunion'})
# The <div> itself has no href attribute, so this would presumably print None
#print(a.get('href'))
# Collect every <a> tag inside the <div>
all_a = a.find_all('a')

# Like this
>>> all_a
[<a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion1">Cabourg</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion2">Duindigt (Pays-Bas)</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion3">Fairview (Afrique du Sud)</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion4">La Teste-de-Buch</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion5">Clairefontaine-Deauville</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion6">Cagnes-sur-Mer</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion7">York (Grande-Bretagne)</a>,
 <a href="/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion8">Divonne-les-Bains</a>]

>>> all_a[0].get('href', 'Not found')
'/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion1'
>>> all_a[1].get('href', 'Not found')
'/reunions-courses-pmu/_d2021-08-20;jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion2'
>>> all_a[1].get('car', 'Not found')
'Not found'

RE: Web Scraping in Python - phochka - Aug-22-2021

Hi snippsat, thanks a lot, that is what I need.
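
A possible follow-up, not part of the original replies: the href values returned above are relative and carry a ;jsessionid=... path parameter. The sketch below uses only the standard library to turn such an href into an absolute URL and drop the session id. The helper name clean_link and the constant BASE_URL are hypothetical, and whether the site serves the page without the session id is an assumption that has not been checked.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

# Page the links were scraped from (taken from the thread)
BASE_URL = 'https://www.geny.com/reunions-courses-pmu?date=2021-08-20'


def clean_link(href, base=BASE_URL):
    """Return an absolute URL for href with any ';jsessionid=...' part removed.

    Hypothetical helper for illustration only.
    """
    # Resolve the relative href against the page URL
    absolute = urljoin(base, href)
    parts = urlsplit(absolute)
    # urlsplit leaves ';params' inside the path, so cut the path at the first ';'
    path = parts.path.split(';', 1)[0]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == '__main__':
    href = ('/reunions-courses-pmu/_d2021-08-20'
            ';jsessionid=1B631A1EE89E45D365AC1AAC3180F2F2?#reunion1')
    print(clean_link(href))
```

With the all_a list from the reply above, something like [(a.get_text(strip=True), clean_link(a['href'])) for a in all_a] would pair each meeting name with its cleaned link. Note that the empty query (the bare ? before #reunion1) is dropped when the URL is rebuilt.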