Python Forum
Coding problem scraping Goodreads reviews with GoodReadsScraper
#1
Hello,

I came across OmarEinea's GoodReadsScraper on GitHub (https://github.com/OmarEinea/GoodReadsScraper) and would like to use his scripts to scrape the English reviews of some English books on Goodreads.

Because I am very new to working with Python, I don't have as much insight as I would like into what the different parts of a script mean. I would therefore be very grateful if someone with more experience could look at the scripts and the steps I took and tell me what I did wrong.

I downloaded the ZIP file containing his scripts, installed the requirements, and created a shelf on Goodreads containing the books whose reviews I wanted to scrape. Because I wanted to scrape English reviews, I changed all instances of "ar" or "arabic" in his scripts to "en" and "english".

The first problem I have is that I do not really understand where I have to add information (and which information) to get what I need. OmarEinea's instructions are very brief and unfortunately are not enough for me, as a layperson, to know what I need to change, in which script, and at which exact place.

What I did was:
1) filled in my username and password in Browser.py
2) changed the path to "Users\xxx\Desktop\GoodReadsScraper-master\BookReviews" in Tools.py
3) created Test.py, consisting of the following code (in which "xxx" stands for the id of my shelf containing the books of which I wish to scrape the reviews):
from Books import Books
from Reviews import Reviews
from Tools import *

# Scrape the books' reviews and write them to a file:
r = Reviews("en")
r.output_books_reviews("xxx")

# Filter the reviews, then combine them:
delete_repeated_reviews()
combine_reviews()

However, when I run this from the command line, I get a text file named "en1.txt" containing a few reviews of Harry Potter books, even though those books are not on my shelf at all, plus an empty folder named "en".
Any advice on what I did wrong, or on how I can adapt my test script or OmarEinea's scripts to get what I need?
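
A minimal sketch of a sanity check for the path from step 2, in case it helps narrow things down. The path as written has no drive letter, so Python would treat it as relative to whatever folder the script is run from. The helper name `check_output_dir` is hypothetical and not part of GoodReadsScraper, and the sketch only assumes that the path set in Tools.py is meant to point at an existing folder.

```python
from pathlib import Path


def check_output_dir(raw_path: str) -> Path:
    """Return raw_path as a Path, or raise if it is relative or missing."""
    path = Path(raw_path)
    if not path.is_absolute():
        # A relative path is resolved against the current working folder,
        # so output can end up somewhere unexpected.
        raise ValueError(f"{raw_path!r} is a relative path")
    if not path.is_dir():
        raise FileNotFoundError(f"{raw_path!r} does not exist or is not a folder")
    return path


if __name__ == "__main__":
    # Hypothetical usage: pass the same string that was put in Tools.py.
    # The r"..." prefix keeps the backslashes from being read as escapes.
    try:
        print(check_output_dir(r"Users\xxx\Desktop\GoodReadsScraper-master\BookReviews"))
    except (ValueError, FileNotFoundError) as err:
        print("Path problem:", err)
```

If the check reports a relative path, writing the full path (including the drive letter on Windows) is one thing to try. Whether this explains the Harry Potter reviews is a separate question that the sketch does not answer.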

I think this could be an interesting learning opportunity for me to gain more insight into scripts and scraping and would be very grateful for your advice or help!

Kind regards and I wish you a happy new year,

Ledgreve

Messages In This Thread
Coding problem scraping Goodreads reviews with GoodReadsScraper - by ledgreve - Jan-06-2020, 08:28 AM
