Python Forum
BeautifulSoup help!
#1
So I've just started out with Python, and an assignment was given to me by a company as a recruitment task.

I need to web scrape the coupons of all the websites available on www.couponraja.in and export them to CSV format.
The details which I need to be present in the CSV are the coupon title, vendor, validity, description/detail, URL to the vendor, and image URL of the coupon.

I have gone through many tutorials on BeautifulSoup and have a beginner's understanding of using it. I wrote some code as well, but the problem I'm facing is that when I collect info from the divs which contain all that info, I get it back with all the HTML tags and the info is clustered together.

Code I'm using:
import requests
from bs4 import BeautifulSoup

url = "https://www.couponraja.in/amazon"
r = requests.get(url)
# specify a parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(r.content, "html.parser")
# each offer on the page sits in a <div class="nw-offrtxt">
g_data = soup.find_all("div", {"class": "nw-offrtxt"})
for item in g_data:
    print(item.contents)
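From what I have read, get_text() should strip out the tags and give just the text of each offer, so something like this might un-clutter the output (not fully tested on my side yet):

for item in g_data:
    # get_text() flattens the whole tag tree into one plain-text string;
    # strip=True trims surrounding whitespace and " " joins the pieces
    print(item.get_text(" ", strip=True))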
I will also need help on how to export the info to CSV format. I just know that I need to import csv and then write the information to a CSV file, but I'm not getting anywhere with how to achieve that.
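For the CSV part, this is roughly what I was imagining. It's just a sketch: the column names are my own, the find() calls for the vendor link and coupon image are guesses at the page structure, and the columns I can't locate yet are left blank:

import csv

with open("coupons.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["title", "vendor", "validity", "description", "vendor_url", "image_url"])
    for item in g_data:
        # flatten the offer div to plain text for the title/description column
        text = item.get_text(" ", strip=True)
        # guessing the vendor link is an <a> and the coupon image an <img> inside the div
        link = item.find("a")
        img = item.find("img")
        vendor_url = link.get("href", "") if link else ""
        image_url = img.get("src", "") if img else ""
        writer.writerow([text, "", "", "", vendor_url, image_url])

Does that look like the right direction?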

Any help will be appreciated.
#2
Hello! What is in g_data? Give us a portion of the output.
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#3
(Oct-06-2016, 09:48 AM)wavic Wrote: Hello! What is in g_data? Give us a portion of the output.


g_data has the information in the nw-offrtxt div of the web page.


[Image: Capture.jpg]


The place I have circled includes all the information I'm seeking. On the other side is the div "nw-offrtxt" which contains the information; g_data has that information.

I'm a noob here, so I might be a bit off track.
#4
I meant the output from print(g_data[0]).
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply

