Python Forum
Code scrape more than one time information
#1
I'm a beginner in Python and web scraping. My objective was to scrape 30 reviews from a TripAdvisor restaurant page. But when I open the output file I have 301 reviews: the 30 reviews are repeated more than five times. Could you tell me what is wrong? What am I missing? This is my code:
with requests.Session() as s:
    for offset in range(10,40):
        url = f'https://www.tripadvisor.fr/Restaurant_Review-g187147-d947475-Reviews-or{offset}-Le_Bouclard-Paris_Ile_de_France.html'
        r = s.get(url)
        soup = bs(r.content, 'lxml')
        reviews = soup.select('.reviewSelector')
        ids = [review.get('data-reviewid') for review in reviews]
        r = s.post(
            'https://www.tripadvisor.fr/OverlayWidgetAjax?Mode=EXPANDED_HOTEL_REVIEWS_RESP&metaReferer=',
            data = {'reviews': ','.join(ids), 'contextChoice': 'DETAIL'},
            headers = {'referer': r.url}
        )

        soup = bs(r.content, 'lxml')
        if not offset:
            inf_rest_name = soup.select_one('.heading').text.replace("\n","").strip()
            rest_eclf = soup.select_one('.header_links a').text.strip()

        for review in soup.select('.reviewSelector'):
            name_client = review.select_one('.info_text > div:first-child').text.strip()
            date_rev_cl = review.select_one('.ratingDate')['title'].strip()
            titre_rev_cl = review.select_one('.noQuotes').text.strip()
            opinion_cl = review.select_one('.partial_entry').text.replace("\n","").strip()
            row = [f"{inf_rest_name}", f"{rest_eclf}", f"{name_client}", f"{date_rev_cl}" , f"{titre_rev_cl}", f"{opinion_cl}"]
            w.writerow(row)
I tried replacing the variable review with opinion_cl, because I thought that was the error, but I still get the same 301 reviews. I would appreciate your help.
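A likely cause, sketched under assumptions rather than verified against the site: `range(10,40)` yields 30 offsets (10, 11, ..., 39), so the loop requests 30 pages. If the `-or{offset}-` part of the URL is a review offset and each page lists 10 reviews (both are assumptions here), consecutive pages overlap almost completely and the file ends up with about 300 rows. The 301st may be a header row, but the code that writes it is not shown. The sketch below simulates that arithmetic without touching the network. `page_offsets`, `simulate_scrape` and `drop_duplicates` are hypothetical helpers, not part of the posted code.

```python
def page_offsets(total_reviews, page_size=10):
    """Offsets for the first `total_reviews` reviews, one per page (hypothetical helper)."""
    return list(range(0, total_reviews, page_size))


def simulate_scrape(offsets, page_size=10):
    """Stand-in for the network calls.

    ASSUMPTION: the page at offset k lists reviews k .. k + page_size - 1.
    """
    rows = []
    for offset in offsets:
        rows.extend(range(offset, offset + page_size))
    return rows


def drop_duplicates(review_ids):
    """Keep the first occurrence of each review id, preserving order."""
    seen = set()
    unique = []
    for review_id in review_ids:
        if review_id not in seen:
            seen.add(review_id)
            unique.append(review_id)
    return unique


posted = simulate_scrape(range(10, 40))      # offsets 10, 11, ..., 39 as in the post
stepped = simulate_scrape(page_offsets(30))  # offsets 0, 10, 20

print(len(posted), len(set(posted)))    # 300 39
print(len(stepped), len(set(stepped)))  # 30 30
print(len(drop_duplicates(posted)))     # 39
```

If the assumption holds, stepping the offset by the page size, for example `range(0, 30, 10)`, would request three non-overlapping pages. Starting at 0 also matters for the `if not offset:` branch: it only runs when `offset` is 0, which `range(10,40)` never produces, so `inf_rest_name` and `rest_eclf` are never assigned inside the loop as posted. Tracking the `data-reviewid` values already written, as `drop_duplicates` does for the simulated ids, is an extra guard against repeated rows.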

Messages In This Thread
Code scrape more than one time information - by Clnprof - Aug-25-2019, 12:57 PM
