Python Forum
Extract json-ld schema markup data and store in MongoDB
#1
I am creating a spider to crawl a webpage's JSON-LD schema markup and store the data in MongoDB. Specifically, I want to extract the data type ("@type": "_____") from the schema markup and store that @type in MongoDB. My spider already crawls the whole schema markup correctly, but I want to know how to extract @type from the JSON-LD markup and store it in MongoDB.
These are my spider files:

apple_spider.py

import scrapy
from pprint import pprint
from extruct.jsonld import JsonLdExtractor
from ..items import ApplespiderItem

class AppleSpider(scrapy.Spider):
    name = 'apple'
    allowed_domains = ['apple.com']
    start_urls = (
        'http://www.apple.com/shop/mac/mac-accessories',
        )

    def parse(self, response):

        extractor = JsonLdExtractor()

        # response.text replaces the deprecated response.body_as_unicode()
        items = extractor.extract(response.text, response.url)
        pprint(items)

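        for item in items:
            # JsonLdExtractor returns each JSON-LD object as a plain dict,
            # so fields sit at the top level (no 'properties' wrapper).
            if item.get('name'):
                offers = item.get('offers') or {}
                yield {
                    'name': item['name'],
                    'price': offers.get('price') if isinstance(offers, dict) else None,
                    'url': item.get('url')
                }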
items.py

import scrapy


class ApplespiderItem(scrapy.Item):
    # define the fields for your item here like:
    name = scrapy.Field()
    price = scrapy.Field()
    url = scrapy.Field()
pipelines.py

import pymongo

class ApplespiderPipeline(object):

	def __init__(self):
		self.conn = pymongo.MongoClient(
			'localhost',
			27017
		)
		db = self.conn['newdb']
		self.collection = db['app_tb']

	def process_item(self, item, spider):
		# Collection.insert() was removed in PyMongo 4; insert_one() stores a single document
		self.collection.insert_one(dict(item))
		return item
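
For the @type part, here is a minimal sketch of one possible approach. It assumes the extractor output is a list of plain dicts parsed from the JSON-LD blocks. The helper name collect_types and the sample data are made up for illustration and do not come from the real page.

```python
def collect_types(items):
    """Return every "@type" value found in a list of JSON-LD objects."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            value = node.get("@type")
            # "@type" may be a single string or a list of strings.
            if isinstance(value, str):
                found.append(value)
            elif isinstance(value, list):
                found.extend(v for v in value if isinstance(v, str))
            # Nested objects (e.g. "offers", "@graph") can carry their own "@type".
            for child in node.values():
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)

    walk(items)
    return found


# Hypothetical sample shaped like parsed JSON-LD; not taken from apple.com.
sample = [
    {
        "@context": "http://schema.org",
        "@type": "Product",
        "name": "Example accessory",
        "offers": {"@type": "Offer", "price": "19.00"},
    }
]

print(collect_types(sample))
```

In parse(), the spider could then yield something like {'type': t, 'url': response.url} for each t in collect_types(items), and the existing pipeline would store each of those dicts as a MongoDB document. If a 'type' field is added to the yielded data and ApplespiderItem is used, the item class would need a matching field.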