Python Forum
Pass multiple items from one parse to another using Scrapy
#1
Dear Members,

I want to pass both name and link_only from parse to parse_country. I know the meta argument is used for this. Could you please tell me how to change the meta argument below so that both values are passed? With the following code, the link_only column in the CSV file is blank.

In my actual project, I have 7 or 8 values to pass from one parse method to another. Please suggest how to do this.

import scrapy
import logging
 
 
class CountriesSpider(scrapy.Spider):
    name = 'countries'
    allowed_domains = ['www.worldometers.info']
    start_urls = ['https://www.worldometers.info/world-population/population-by-country/']
 
    def parse(self, response):
        countries = response.xpath("//td/a")
        for country in countries:
            name = country.xpath(".//text()").get()
            link_only = country.xpath(".//@href/text()").get()
            link = country.xpath(".//@href").get()

            yield response.follow(url=link, callback=self.parse_country, meta={'country_name': name, 'link_only': link_only})

    def parse_country(self, response):
        name = response.request.meta['country_name']
        link_only = response.request.meta['link_only']
        rows = response.xpath("(//table[@class='table table-striped table-bordered table-hover table-condensed table-list'])[1]/tbody/tr")
        for row in rows:
            year = row.xpath(".//td[1]/text()").get()
            population = row.xpath(".//td[2]/strong/text()").get()

            yield {
                'country_name': name,
                'link_only': link_only,
                'year': year,
                'population': population
            }
#2
Hi,
The XPath you used for link_only returns None, so that column in the CSV is blank.
Just change this:
link_only = country.xpath(".//@href/text()").get()
into this:
link_only = country.xpath(".//@href").get()
@href selects an attribute, and an attribute node has no text() child, so appending /text() matches nothing and .get() returns None.
#3
Thank you for your help. It worked.