Sep-09-2021, 07:03 PM
(Sep-09-2021, 12:40 PM)muzikman Wrote: I'm assuming, dataset is a SQL package?
It's built on top of SQLAlchemy, so dataset works with all major databases, such as SQLite, PostgreSQL and MySQL.
The advantage is of course the simple usage, so it's a good fit for smaller projects.
I can use your code as an example, with some modifications, to make a database using the built-in sqlite3.
import requests
import bs4
import dataset

def soup_request(page_num, base_url):
    res = requests.get(base_url.format(page_num))
    return bs4.BeautifulSoup(res.content, 'lxml')

page_num = 1
page_count = 20
base_url = 'https://books.toscrape.com/catalogue/page-{}.html'
book_list = []
# 20 books per page, 1000 books in total
while page_count * page_num <= 1000:
    soup = soup_request(page_num, base_url)
    books = soup.select('.product_pod')
    for book in books:
        rating = book.p['class'][1]
        title = book.h3.a['title']
        book_list.append((('Title', title), ('Rating', rating)))
    page_num += 1

# To DB
# Turn each pair of (column, value) tuples into a dict for dataset
books = []
for book in book_list:
    books.append(dict((y, x) for y, x in book))

db = dataset.connect('sqlite:///books.db')
table = db['book_table']
for book_info in books:
    table.insert(book_info)

So not much code, and now you have a fully functional database.
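For comparison, here is a rough sketch of the same insert step using only the standard library sqlite3 module, where the table has to be created by hand. The `save_books` helper, the in-memory database and the two sample rows are assumptions made for illustration, and the schema dataset actually generates may differ in details.

```python
import sqlite3


def save_books(conn, books):
    """Create book_table if needed and insert dicts with Title/Rating keys."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book_table ("
        "id INTEGER PRIMARY KEY, Title TEXT, Rating TEXT)"
    )
    # Named placeholders take their values from each dict's keys
    conn.executemany(
        "INSERT INTO book_table (Title, Rating) VALUES (:Title, :Rating)",
        books,
    )
    conn.commit()


# Sample rows only; the real script would pass the scraped `books` list
sample_books = [
    {"Title": "Sapiens: A Brief History of Humankind", "Rating": "Five"},
    {"Title": "History of Beauty", "Rating": "Four"},
]

conn = sqlite3.connect(":memory:")  # pass 'books.db' to write a file instead
save_books(conn, sample_books)
rows = conn.execute("SELECT id, Title, Rating FROM book_table").fetchall()
print(rows)
```

With dataset, the CREATE TABLE and INSERT statements above are the part you do not have to write yourself.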
Usage test.
>>> Sapiens = table.find_one(Title='Sapiens: A Brief History of Humankind')
>>> Sapiens
OrderedDict([('id', 5), ('Title', 'Sapiens: A Brief History of Humankind'), ('Rating', 'Five')])
>>> Sapiens.get('Rating')
'Five'

You also have the full power of SQL queries, so here is how to find which books have History in the title.

>>> [b for b in db.query("SELECT * from book_table WHERE (lower(Title) LIKE '%history%');")]
[OrderedDict([('id', 5),
              ('Title', 'Sapiens: A Brief History of Humankind'),
              ('Rating', 'Five')]),
 OrderedDict([('id', 60),
              ('Title',
               'The Natural History of Us (The Fine Art of Pretending #2)'),
              ('Rating', 'Three')]),
 OrderedDict([('id', 134),
              ('Title',
               'Thomas Jefferson and the Tripoli Pirates: The Forgotten War '
               'That Changed American History'),
              ('Rating', 'One')]),
 OrderedDict([('id', 147),
              ('Title',
               "The Omnivore's Dilemma: A Natural History of Four Meals"),
              ('Rating', 'Two')]),
 OrderedDict([('id', 303),
              ('Title', 'Greek Mythic History'),
              ('Rating', 'Five')]),
 OrderedDict([('id', 347),
              ('Title', "A People's History of the United States"),
              ('Rating', 'Two')]),
 OrderedDict([('id', 464),
              ('Title', 'Please Kill Me: The Uncensored Oral History of Punk'),
              ('Rating', 'Four')]),
 OrderedDict([('id', 480),
              ('Title', 'History of Beauty'),
              ('Rating', 'Four')]),
 OrderedDict([('id', 486),
              ('Title',
               'Brilliant Beacons: A History of the American Lighthouse'),
              ('Rating', 'Three')]),
 OrderedDict([('id', 544),
              ('Title', 'A Short History of Nearly Everything'),
              ('Rating', 'Five')]),
 OrderedDict([('id', 547),
              ('Title',
               'The Rise and Fall of the Third Reich: A History of Nazi '
               'Germany'),
              ('Rating', 'Two')]),
 OrderedDict([('id', 640),
              ('Title',
               "America's War for the Greater Middle East: A Military History"),
              ('Rating', 'Two')]),
 OrderedDict([('id', 691),
              ('Title',
               'A History of God: The 4,000-Year Quest of Judaism, '
               'Christianity, and Islam'),
              ('Rating', 'One')]),
 OrderedDict([('id', 693),
              ('Title', 'Zero History (Blue Ant #3)'),
              ('Rating', 'One')]),
 OrderedDict([('id', 695),
              ('Title', 'World War Z: An Oral History of the Zombie War'),
              ('Rating', 'One')]),
 OrderedDict([('id', 757),
              ('Title',
               'The Disappearing Spoon: And Other True Tales of Madness, Love, '
               'and the History of the World from the Periodic Table of the '
               'Elements'),
              ('Rating', 'Five')])]
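To answer the "how many" part directly, the query can count rows instead of returning them. The sketch below uses the standard library sqlite3 module with a bound parameter rather than a string literal in the SQL. `count_titles_containing` is a hypothetical helper, and the small in-memory table stands in for books.db; its third row is made up so that there is one title without a match.

```python
import sqlite3


def count_titles_containing(conn, word):
    """Return how many titles contain `word`, ignoring case."""
    # Lower-case both sides so the match does not depend on the
    # database's own LIKE case rules; the pattern is bound as a parameter
    pattern = f"%{word.lower()}%"
    row = conn.execute(
        "SELECT COUNT(*) FROM book_table WHERE lower(Title) LIKE ?",
        (pattern,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")  # stand-in for sqlite3.connect('books.db')
conn.execute(
    "CREATE TABLE book_table (id INTEGER PRIMARY KEY, Title TEXT, Rating TEXT)"
)
conn.executemany(
    "INSERT INTO book_table (Title, Rating) VALUES (?, ?)",
    [
        ("Sapiens: A Brief History of Humankind", "Five"),
        ("Greek Mythic History", "Five"),
        ("Some Other Book", "Three"),  # made-up row that should not match
    ],
)
history_count = count_titles_containing(conn, "History")
print(history_count)
```

Pointed at the real books.db, the same function should report the number of rows in the listing above, which is 16.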