Python Forum

How to crawl schema markup data type using Scrapy?
I tried to crawl a page and extract the data type of its schema markup in Python using this:

response.xpath("//script[@type='application/ld+json']/text()").extract()
but this extracts the whole schema markup. Is there any method to crawl and extract only the data type of the schema markup?
You can use the json module to convert the string you have to a dict, and then just access the data you want.
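A minimal sketch of the json approach, assuming that "data type" means the JSON-LD "@type" key. The XPath call with .extract() returns a list of strings, one per script tag, so the helper loops over them. The name schema_types and the sample string are made up for illustration.

```python
import json


def schema_types(blocks):
    """Return the "@type" values found in a list of JSON-LD strings."""
    types = []
    for raw in blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip script tags whose content is not valid JSON
        # A block may hold one object or a list of objects.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            # Some pages wrap their objects in an "@graph" list.
            for node in item.get("@graph", [item]):
                if isinstance(node, dict) and "@type" in node:
                    types.append(node["@type"])
    return types


# Hypothetical sample standing in for the list returned by .extract()
sample = ['{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}']
print(schema_types(sample))  # ['Product']
```

In the spider you would pass the result of the XPath call from the question straight to schema_types(). Note that "@type" can itself be a list when an object has several types, and the sketch returns it unchanged in that case.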
Another option is using extruct, which will extract the schema information automatically.
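A rough sketch of the extruct route, again assuming the goal is the "@type" value of each JSON-LD item. extruct.extract() takes the page HTML and returns a dict keyed by syntax name. The helper names collect_types and jsonld_types are made up for illustration, and extruct is a third-party package that has to be installed separately.

```python
def collect_types(items):
    """Return the "@type" value of every item that has one."""
    return [
        item["@type"]
        for item in items
        if isinstance(item, dict) and "@type" in item
    ]


def jsonld_types(html, url):
    """Return the "@type" values of the JSON-LD items extruct finds in html."""
    import extruct  # third-party package: pip install extruct

    # Restrict extraction to JSON-LD; the result is a dict keyed by syntax name.
    data = extruct.extract(html, base_url=url, syntaxes=["json-ld"])
    return collect_types(data.get("json-ld", []))
```

Inside a Scrapy callback this would be called as jsonld_types(response.text, response.url).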