Python Forum
Scrapy Picking What to Output Href or Img
#1
Hi again guys,

There are actually two things I'd like help with (all of this concerns what is inside the 'a' tag):

First one is:
I have a few items that I would like to output to a CSV file:
href
inner text and/or text
rel='nofollow'. If the spider does not find this attribute, how would I have it write 'do-follow' to the CSV instead?
img tag. How can I output 'img' if the spider sees the 'a' link and finds that it wraps an 'img' tag?
All of these are inside the 'item'.

Second one is:
I usually see scraping examples that just use 'item ='. Can I use conditionals such as if/else there, since it's inside a function definition?
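On the second question: a Scrapy callback such as `parse` is an ordinary Python function, so if/else works there like anywhere else. Below is a minimal sketch of the decision logic on its own, written as a plain helper that a callback could call. The name `describe_link`, its parameters, and the 'text' label for non-image links are assumptions for illustration, not anything Scrapy provides.

```python
def describe_link(rel, has_img):
    """Return the two CSV labels for one <a> tag.

    rel     -- value of the rel attribute, or None when it is absent
    has_img -- True when the <a> tag wraps an <img>
    """
    # rel may hold several space-separated tokens, e.g. "nofollow noopener"
    tokens = (rel or "").lower().split()
    if "nofollow" in tokens:
        follow = "no-follow"
    else:
        follow = "do-follow"

    if has_img:
        link_type = "img"
    else:
        link_type = "text"  # assumed label for plain text links

    return follow, link_type


print(describe_link("nofollow", False))  # ('no-follow', 'text')
print(describe_link(None, True))         # ('do-follow', 'img')
```

Keeping the conditionals in a small function like this makes them easy to check without running a crawl.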

To give an example:
Quote: <a href="http://myexampledomain.com"><img src='/example.jpg' alt='my inner text'>

How can I dissect this and output each part to my CSV file?
href
img (the text written to my CSV would be 'img')
alt or inner text

Sorry that this isn't just one question.

Many thanks for the help and enlightenment. :)
#2
It would be a very big help if anyone could help me out.