Python Forum
error in code web scraping
#1
Hi all,
I tried to scrape a web page with Selenium's webdriver (chromedriver) and BeautifulSoup.
Here is my code:
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
driver = webdriver.Chrome("/usr/local/bin/chromedriver")
products=[] #List to store name of the product
prices=[] #List to store price of the product
ratings=[] #List to store rating of the product
driver.get("https://www.flipkart.com/search?q=laptop&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=off&as=off")
content = driver.page_source
soup = BeautifulSoup(content)
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34 _2beYZw'})
products.append(name.text)
prices.append(price.text)
ratings.append(rating.text) #line 25
df = pd.DataFrame({'Product Name':products,'Price':prices,'Rating':ratings}) 
df.to_csv('products.csv', index=False, encoding='utf-8')
But I get this error:
Error:
AttributeError: 'NoneType' object has no attribute 'text'
I don't understand why.
Which attribute do I have to use in the append call?

Best regards.
Alexis
#2
It means that one or more of the elements you are trying to find here (I can't tell which from the incomplete traceback):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34 _2beYZw'})
do not exist, so find returns None, and when you then try to access the text attribute you get the error.

You should probably double check the class names used.
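
One way to see which lookup failed is to collect the results under their names and list the ones that came back as None. This is only a sketch: `missing_fields` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the tags that `a.find(...)` would return on the real page.

```python
from types import SimpleNamespace


def missing_fields(lookups):
    """Return the names whose lookup result is None."""
    return [name for name, element in lookups.items() if element is None]


# Stand-ins for the results of a.find(...): a matched tag, or None.
lookups = {
    "name": SimpleNamespace(text="Some laptop"),
    "price": SimpleNamespace(text="39990"),
    "rating": None,  # e.g. the class name did not match anything
}

print(missing_fields(lookups))  # ['rating']
```

Any name that shows up in the result points at a selector worth re-checking against the page source.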
#3
Hi mlieqo,
Thank you for your answer.
I tried the code without the "text" attribute:
products.append(name)
prices.append(price)
ratings.append(rating) 
I no longer get the error, but nothing is written to the file products.csv

Do you think I have to put this code
products.append(name)
prices.append(price)
ratings.append(rating) 
in the for loop?

Yours.
Alexis
#4
I think you've misunderstood what you've been told. find returns an object, and you need its text attribute to get the text from it. In your case, find is returning None because no element matching the criteria you specified (e.g. a "div" element with a class of "_3wU53n") could be found. So None is assigned to your variable, and when you try to access attributes or call methods on it, the same kind of problem will occur.

As mlieqo suggests, you should check the criteria you're using to find the elements: are you using the right tag and class names?
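
The error can be reproduced without any scraping at all, which may make the cause easier to see. In this small sketch, `element` simply plays the role of a variable that `find` left as None:

```python
element = None  # what find() gives back when nothing matches

try:
    element.text
except AttributeError as exc:
    message = str(exc)

print(message)  # 'NoneType' object has no attribute 'text'
```

Removing `.text` only hides the symptom: the variable is still None, so there is nothing useful to write to the file.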
#5
As mentioned, you need to check that the class names are correct: for the rating it is only hGSR34, not hGSR34 _2beYZw.
Also, not every product has a rating, so that case needs handling.
You must specify which parser BeautifulSoup should use, or you get a warning; preferably use lxml.
Since you have a loop, move the append lines into it; otherwise only one product (the last one found) ends up in the lists.
soup = BeautifulSoup(browser.page_source, 'lxml')
products = [] #List to store name of the product
prices = [] #List to store price of the product
ratings = [] #List to store rating of the product
browser.get("https://www.flipkart.com/search?q=laptop&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=off&as=off")
content = browser.page_source
soup = BeautifulSoup(content, 'lxml')
for a in soup.findAll('a', href=True, class_="_31qSD5"):
    name = a.find('div', class_="_3wU53n")
    price = a.find('div', class_="_1vC4OE _2rQ-NK")
    rating = a.find('div',class_="hGSR34")
    products.append(name.text)
    prices.append(price.text)
    try:
        ratings.append(rating.text)
    except AttributeError:
        pass
>>> ratings
['4.2',
 '4.4',
 '4.5',
 '4.3',
 '4.4',
 '4',
 '4.2',
 '3.7',
 '4.6',
 '4.4',
 '4.5',
 '4',
 '4.5',
 '4.4',
 '4.7',
 '4',
 '4.5',
 '4.6',
 '4.2',
 '4.6',
 '4.3',
 '4.3',
 '5']
>>> prices
['₹39,990',
 '₹1,20,990',
 '₹35,990',
 '₹56,990',
 '₹52,990',
 '₹59,990',
 '₹32,990',
 '₹39,990',
 '₹43,990',
 '₹52,990',
 '₹59,990',
 '₹60,990',
 '₹35,990',
 '₹39,990',
 '₹38,990',
 '₹73,990',
 '₹24,990',
 '₹55,990',
 '₹69,990',
 '₹1,01,990',
 '₹59,990',
 '₹54,990',
 '₹61,990',
 '₹75,990']
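
To illustrate the point about moving the append lines into the loop, here is a sketch with a plain list in place of the scraped tags (the names `items`, `outside`, and `inside` are made up for the example):

```python
items = ["a", "b", "c"]

# append placed after the loop: it runs once, with the last value seen
outside = []
for item in items:
    current = item
outside.append(current)

# append placed inside the loop: it runs on every iteration
inside = []
for item in items:
    inside.append(item)

print(outside)  # ['c']
print(inside)   # ['a', 'b', 'c']
```

The only difference between the two versions is the indentation of the append line.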
#6
Hi all,
Thank you for your answers.
That is much clearer to me now.
You are right, I had the wrong class name for the rating.
Thank you snippsat for your solution.
I just have to modify the code a bit, because with this code:
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34'})
    products.append(name.text)
    prices.append(price.text)
    try:
        ratings.append(rating.text)
    except AttributeError:
        pass 
df = pd.DataFrame({'Product Name':products,'Price':prices,'Rating':ratings}) 
df.to_csv('products.csv', index=False, encoding='utf-8')
if I run it with "pass", I can't create my CSV file afterwards because the lists don't have the same length.
So I can replace "pass" with a None value, like this:
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34'})
    products.append(name.text)
    prices.append(price.text)
    try:
        ratings.append(rating.text)
    except AttributeError:
        ratings.append(None)  # placeholder keeps the lists the same length
df = pd.DataFrame({'Product Name':products,'Price':prices,'Rating':ratings}) 
df.to_csv('products.csv', index=False, encoding='utf-8')
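
Another way to keep the columns aligned, sketched here under some assumptions, is to build one record per product and fall back to None when an element is missing. `text_or_default` is a hypothetical helper (not part of BeautifulSoup), and the `SimpleNamespace` objects with made-up values stand in for the tags that `find` would return.

```python
from types import SimpleNamespace


def text_or_default(element, default=None):
    """Return element.text, or default when element is None."""
    return getattr(element, "text", default)


# Made-up stand-ins for (name, price, rating) lookups of two products;
# the second product has no rating, so its lookup is None.
scraped = [
    (SimpleNamespace(text="Laptop A"), SimpleNamespace(text="39990"),
     SimpleNamespace(text="4.2")),
    (SimpleNamespace(text="Laptop B"), SimpleNamespace(text="56990"), None),
]

rows = [
    {
        "Product Name": text_or_default(name),
        "Price": text_or_default(price),
        "Rating": text_or_default(rating),
    }
    for name, price, rating in scraped
]

print(len(rows))          # 2
print(rows[1]["Rating"])  # None
```

Because every product contributes exactly one record, the columns cannot drift apart, and a list of dicts like `rows` can be passed to `pd.DataFrame(rows)` before calling `to_csv`.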
Best regards.

Alexis