Python Forum
Add elements to a Dictionary
#11
I read over the documentation for the dataset package. It looks like I would have to use plain SQL to insert relational data. For instance:

I have a table called Movie, others called Person and Role, and join tables between them.

So, I would have a table structure like:

Movie > Movie_Person > Person
In more powerful ORMs you can relate these tables, then take a collection of Movies and People and insert them with one call. The ORM automatically adds the corresponding rows to the join table. I used Laravel, the PHP framework, and its ORM was awesome. I used it to harvest data from a few RESTful movie APIs. The responses were all JSON; I would turn the JSON into PHP objects, then use the ORM to insert a movie and its entire cast with one call, across three tables. The framework is now badly bloated and is basically another copy of the Symfony framework.

Scraping is one thing but a RESTful API does store the data relationally.

Dataset sounds like it would be a good fit for a NoSQL database such as MongoDB.
I apologize for quoting all of the previous code in my replies. I will stop doing that; I didn't even realize I was doing it.

Dataset might actually be good for RESTful APIs because the API is relational. You grab the movie object and the rest is underneath that.

For scraping, if I grab Movies, I will get a primary key ID on insert. However, if I were to scrape the cast for that movie, I would have to insert them into the Person table, then relate them to the movie through a join table. This would all be one process.
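
A minimal sketch of that one-process insert, using only the built-in sqlite3 module rather than an ORM. The table names follow the Movie > Movie_Person > Person layout above; the column names, the helper insert_movie_with_cast, and the sample values are assumptions made up for illustration.

```python
import sqlite3


def insert_movie_with_cast(conn, title, cast):
    """Insert one movie, its cast, and the join rows in a single transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        cur = conn.execute("INSERT INTO movie (title) VALUES (?)", (title,))
        movie_id = cur.lastrowid  # primary key generated for the new movie
        for name in cast:
            # Reuse an existing person row if this name is already stored.
            conn.execute("INSERT OR IGNORE INTO person (name) VALUES (?)", (name,))
            person_id = conn.execute(
                "SELECT id FROM person WHERE name = ?", (name,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO movie_person (movie_id, person_id) VALUES (?, ?)",
                (movie_id, person_id),
            )
    return movie_id


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE movie_person (
        movie_id INTEGER NOT NULL REFERENCES movie(id),
        person_id INTEGER NOT NULL REFERENCES person(id),
        PRIMARY KEY (movie_id, person_id)
    );
""")

# Hypothetical sample data.
movie_id = insert_movie_with_cast(conn, "Example Movie", ["Actor One", "Actor Two"])
rows = conn.execute(
    "SELECT p.name FROM person p "
    "JOIN movie_person mp ON mp.person_id = p.id "
    "WHERE mp.movie_id = ? ORDER BY p.name",
    (movie_id,),
).fetchall()
print(rows)
```

Because the movie ID comes back from the insert, the cast rows can be linked in the same transaction; an ORM with relationships would hide these steps behind one call.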

If I scraped the movie data one day, then tried to associate the cast with that movie another day, that would be difficult to do.
With a RESTful API, even if I insert the movie data one day and the cast another day, I still have the movie ID as the primary key in the Movie table. So I would just query the API with my movie ID and grab what I need.

Anyway, scraping has its own purpose. Just my two cents.
I have a couple of questions in the comments of your code below:

Quote:You can use your code as an example, with some modification, to make a database using the built-in sqlite3.
import requests
import bs4
import dataset

def soup_request(page_num, base_url):
    res = requests.get(base_url.format(page_num))
    return bs4.BeautifulSoup(res.content, 'lxml')

page_num = 1
page_count = 20
base_url = 'https://books.toscrape.com/catalogue/page-{}.html'
book_list = []
while page_count * page_num <= 1000:
    soup = soup_request(page_num, base_url)
    books = soup.select('.product_pod')
    for book in books:
        rating = book.p['class'][1]
        title = book.h3.a['title']
        # You're appending a tuple, but each inner tuple has an identifier: 'Title' and 'Rating'.
        # Does this associate the key with the value here? My solution wouldn't have had a key reference.
        # I like your solution better.
        book_list.append((('Title', title), ('Rating', rating)))
    page_num += 1

# To DB
books = []
for book in book_list:
    books.append(dict((y, x) for y, x in book))  # Is this tuple comprehension converted to a dictionary object to unpack x, y?
db = dataset.connect('sqlite:///books.db')
table = db['book_table']
for book_info in books:
    table.insert(book_info)
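
For comparison, the quoted suggestion about the built-in sqlite3 module could look roughly like this. It is only a sketch: the in-memory database, the column types, and the second sample row are assumptions, and the list stands in for the books list built by the scraper above.

```python
import sqlite3

# Stand-in for the scraped list of dictionaries; the second row is made up.
books = [
    {"Title": "1,000 Places to See Before You Die", "Rating": "Five"},
    {"Title": "Example Book", "Rating": "Three"},
]

conn = sqlite3.connect(":memory:")  # pass "books.db" to write a file instead
conn.execute(
    "CREATE TABLE IF NOT EXISTS book_table "
    "(id INTEGER PRIMARY KEY, Title TEXT, Rating TEXT)"
)
with conn:  # one transaction for all rows
    # Named placeholders are filled from the dictionary keys.
    conn.executemany(
        "INSERT INTO book_table (Title, Rating) VALUES (:Title, :Rating)", books
    )

stored = conn.execute("SELECT Title, Rating FROM book_table ORDER BY id").fetchall()
print(stored)
```

Unlike dataset, sqlite3 does not create the table or columns for you, so the schema has to be written out by hand.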
#12
muzikman Wrote:You're appending a tuple but each tuple has an identifier 'Title' and 'Rating'.
Does this associate the key with the value here? My solution wouldn't have had a key reference
It makes the tuple easier to convert to a dictionary.
This is the conversion part. In Python it is easy to take out a small piece like this and test it interactively.
>>> t = (('Title', '1,000 Places to See Before You Die'), ('Rating', 'Five'))
>>> d = dict((y, x) for y, x in t)
>>> d
{'Title': '1,000 Places to See Before You Die', 'Rating': 'Five'}
>>> d['Rating']
'Five'
>>> d['Title']
'1,000 Places to See Before You Die'
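
As a side note, dict() accepts any iterable of (key, value) pairs, so the generator expression is not strictly needed here. A short sketch using the same tuple; the names keys and values are only for illustration:

```python
t = (('Title', '1,000 Places to See Before You Die'), ('Rating', 'Five'))

# dict() consumes the (key, value) pairs directly.
d = dict(t)

# The same dictionary built from separate key and value sequences.
keys = ('Title', 'Rating')
values = ('1,000 Places to See Before You Die', 'Five')
d2 = dict(zip(keys, values))

print(d == d2)
```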
Quote:In more powerful ORMs you can relate these tables, then take a collection of Movies and People and insert them with one call. The ORM automatically adds the corresponding rows to the join table. I used Laravel, the PHP framework, and its ORM was awesome. I used it to harvest data from a few RESTful movie APIs. The responses were all JSON; I would turn the JSON into PHP objects, then use the ORM to insert a movie and its entire cast with one call, across three tables. The framework is now badly bloated and is basically another copy of the Symfony framework.
If I were doing this on the web with Python, I would use Flask with Flask-SQLAlchemy.
Django is OK if you want a larger web framework with most things built in.
FastAPI is the new star ✨
#13
Yes, I get it. Makes sense.

I'm not sure where I am headed with Python. The web is the smoothest transition. I've never written a GUI in my life. But I will try to at the end of this course. I will take a course on Flask after I am done with the current one.

It's been a while since I've dealt with a stateless environment. So, I have a quick question:

When using classes and creating instances on the web, do I have to change the name of the instance for each visitor, like using a session id GUID? I forget this stuff.
Actually, I just remembered that objects are passed by reference. So, if I am not mistaken, separate visitors can have the same instance name because the memory address is different.
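
A small sketch of that idea in plain Python, with no web framework involved. The Cart class and the handle_request function are hypothetical stand-ins for whatever a real view function would do; the point is only that a local name is bound to a new object on every call.

```python
class Cart:
    """Hypothetical per-visitor object."""

    def __init__(self, visitor):
        self.visitor = visitor
        self.items = []


def handle_request(visitor, item):
    # The local name "cart" is the same on every call,
    # but each call creates and binds a brand-new Cart object.
    cart = Cart(visitor)
    cart.items.append(item)
    return cart


first = handle_request("visitor-a", "book")
second = handle_request("visitor-b", "pen")

print(first is second)            # False: two distinct objects
print(first.items, second.items)  # each object has its own list
```

Keeping data between requests for the same visitor is a separate matter. In a stateless setup that usually means a session or a database rather than a long-lived instance.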