Python Forum
How to crawl schema markup data type using scrapy?
#1
I tried to crawl a page and extract the data type from its schema markup with Scrapy, using this:

response.xpath("//script[@type='application/ld+json']/text()").extract()
But this extracts the whole schema markup code. Is there any way to crawl and extract only the data type of the schema markup?
#2
You can use the json module to convert the string you have to a dict, and then just access the data you want.
Another option is using extruct, which will extract the schema information automatically.
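
As a rough sketch of the json approach: the function below takes the list of strings that the xpath(...).extract() call from the question returns and pulls out only the "@type" values. The function name extract_schema_types and the sample string are made up for illustration, and it assumes the data type you want is the JSON-LD "@type" key at the top level of each block or inside an "@graph" list. Types on more deeply nested objects are not collected.

```python
import json


def extract_schema_types(ld_json_blocks):
    """Return the @type values found in a list of JSON-LD strings."""
    types = []

    def collect(node):
        # A JSON-LD block may be one object, a list of objects,
        # or an object that wraps its items in "@graph".
        if isinstance(node, list):
            for item in node:
                collect(item)
        elif isinstance(node, dict):
            node_type = node.get("@type")
            if isinstance(node_type, list):
                # "@type" may itself be a list of type names.
                types.extend(node_type)
            elif node_type is not None:
                types.append(node_type)
            collect(node.get("@graph", []))

    for raw in ld_json_blocks:
        try:
            collect(json.loads(raw))
        except json.JSONDecodeError:
            # Skip script tags whose content is not valid JSON.
            continue
    return types


# Hypothetical sample standing in for the result of
# response.xpath("//script[@type='application/ld+json']/text()").extract()
sample = ['{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}']
print(extract_schema_types(sample))  # ['Product']
```

In a spider you would pass the extracted list straight in, e.g. extract_schema_types(response.xpath("//script[@type='application/ld+json']/text()").extract()).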