Python Forum
Only print new data/links
#1
I'm coding a page monitor to grab the links of new items on nike.com, but I'm not sure how to have Python return a link only if the item is brand new (having just been uploaded to the site). What I have coded so far prints the link of the most recent item, but that link has been on the page for days. Again, I just want Python to return the link of a new item, not an item that has been on the site for days, weeks, etc. Any help would be great. Here is the code for my page monitor so far...

import requests
from bs4 import BeautifulSoup
import time
import json

headers = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36'
}

while True:  
    def item_finder():
        source = requests.get('https://www.nike.com/launch/', headers=headers).text
        soup = BeautifulSoup(source, 'lxml')
        card = soup.find('figure', class_='item ncss-col-sm-12 ncss-col-md-6 ncss-col-lg-4 va-sm-t pb2-sm pb4-md prl0-sm prl2-md ')
        card_data = "https://nike.com" + card.a.get('href')
        print(card_data)
#2
If there's no date in the page source anywhere, then the only way you can know if it's new is if you yourself keep track of what's old. Use either a text file or a small database, where you can list URLs you've seen before.
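
For the "small database" option, here is a minimal sketch using the standard-library sqlite3 module. The table name `seen` and the helpers `open_seen_db` and `is_new` are made up for illustration; nothing here comes from the original monitor.

```python
import sqlite3


def open_seen_db(path=":memory:"):
    """Open (or create) a database holding one table of URLs seen so far.

    The default ":memory:" database vanishes when the program exits;
    pass a file name (hypothetical, e.g. "seen.db") to keep it between runs.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (url TEXT PRIMARY KEY)")
    return conn


def is_new(conn, url):
    """Return True the first time a URL is passed in, False afterwards."""
    row = conn.execute("SELECT 1 FROM seen WHERE url = ?", (url,)).fetchone()
    if row is not None:
        return False  # already recorded, so it is not new
    with conn:  # commits the insert on success
        conn.execute("INSERT INTO seen (url) VALUES (?)", (url,))
    return True
```

In the monitor, the print would then only run when `is_new(conn, card_data)` is true.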


Unrelated, but how many functions named item_finder do you need?

(May-24-2018, 07:09 PM)snifferprime Wrote:
while True:  
    def item_finder():
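
To spell out the point about the quoted snippet: it defines item_finder again on every pass of the loop and never calls it, and it never pauses between passes. A rough sketch of the usual shape, where the function is defined once and the loop only calls it and sleeps. The name `poll` and its parameters `interval_seconds`, `max_polls` and `sleep` are assumptions for illustration.

```python
import time


def poll(item_finder, interval_seconds=60, max_polls=None, sleep=time.sleep):
    """Call item_finder repeatedly, pausing between calls.

    max_polls=None keeps going forever, like `while True`;
    a number stops after that many calls, which is handy for testing.
    Returns how many times item_finder was called.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        item_finder()            # the function is defined once, elsewhere
        polls += 1
        sleep(interval_seconds)  # avoid hammering the site
    return polls
```

With the original script that would be `poll(item_finder)`, with `def item_finder():` moved above the loop to module level.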
#3
(May-24-2018, 08:04 PM)nilamo Wrote:
If there's no date in the page source anywhere, then the only way you can know if it's new is if you yourself keep track of what's old. Use either a text file or a small database, where you can list URLs you've seen before.

Unrelated, but how many functions named item_finder do you need?

(May-24-2018, 07:09 PM)snifferprime Wrote:
while True:
    def item_finder():

I need all the item_finder functions ;) Thanks for the reply! I now write all current links to a text file, then read the file back and print card_data only if it is not already in the file. So far so good.
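
For anyone finding this later, the text-file approach described above might look roughly like this. It is a sketch, not the exact code used: the file name `seen_links.txt` and the helpers `load_seen` and `report_if_new` are hypothetical.

```python
from pathlib import Path


def load_seen(path):
    """Read the file of known links into a set (empty if the file is missing)."""
    p = Path(path)
    if not p.exists():
        return set()
    lines = p.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def report_if_new(url, path="seen_links.txt"):
    """Print and record url only if it is not already in the file."""
    if url in load_seen(path):
        return False  # link was seen on an earlier run
    with open(path, "a", encoding="utf-8") as f:
        f.write(url + "\n")  # one link per line
    print(url)
    return True
```

Comparing whole lines in a set, rather than checking whether the URL appears somewhere in the file's text, keeps a short link from matching as part of a longer one.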