Python Forum
Detect comments part using web scraping
#1
Hi
I'm following this tutorial to scrape the Ajax comments section of a website: https://likegeeks.com/python-web-scraping/

I'm using both BeautifulSoup and Selenium to scrape the Ajax comments.
The article shows how to scrape Ajax content, but my question is:
Is there a way to detect which section of a page is the comments section, regardless of the website being scraped? For example, with an advanced scraping library or something similar?

Thanks in advance.
#2
(Jan-18-2018, 08:06 PM)seco Wrote: Is there a way to detect that this section is the content section regardless the scraped website?
Well, it always depends on the website's structure. There is no "universal" solution that fits all cases.
#3
What about detecting it using AI libraries if there are any?
Reply
#4
Why do you want the comments? There's rarely anything useful in them.

And there might be a library to scrape the contents. You can brute force it by just iterating over all nodes and getting the text content, but then you'll end up with a lot of extra stuff like navigation and copyright info. Saying "it's not possible" doesn't make sense, as every search engine has been doing it for years, lol
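For what it's worth, here is a minimal sketch of that brute-force idea. It uses only the standard library's html.parser so it runs without extra installs; the thread's BeautifulSoup would do the same job. The names TextCollector, all_text and SAMPLE are made up for illustration, and the sample page is invented. With an Ajax site you would feed it the rendered HTML (for example Selenium's driver.page_source) instead.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect every non-empty text node, skipping script/style contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def all_text(html):
    """Return the text of every node in document order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Invented sample page, for illustration only.
SAMPLE = """<html><body>
<nav>Home | About</nav>
<article><p>Main article text.</p></article>
<div id="comments"><p>Nice post!</p></div>
<footer>Copyright 2018</footer>
<script>var x = 1;</script>
</body></html>"""

for chunk in all_text(SAMPLE):
    print(chunk)
```

As expected, the navigation and copyright text come back mixed in with the article and the comment, so this alone does not tell you which part is the comments section.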
#5
(Jan-18-2018, 09:33 PM)nilamo Wrote: Saying "it's not possible" doesn't make sense, as every search engine has been doing it for years, lol
I didn't say it's not possible, but the OP asks for an advanced scraping library, which I understand to mean an advanced Python library that supports extracting comments from ANY website out of the box. What you refer to is much more in the domain of AI and ML than web scraping. Pardon my skepticism if the OP is on the verge of developing the next Google-killer search engine.
#6
I need to identify links in comment sections for SEO purposes, that's all.
I don't know how effective a link in a comment is compared with one in the body of the text.
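One rough way to split links like that is sketched below. It is a heuristic, not a universal detector: it assumes the comments live inside an element whose id or class contains the word "comment", which is common but depends entirely on the site's structure. It also assumes reasonably well-formed markup (closing tags present). The names CommentLinkFinder and split_links and the sample HTML are hypothetical; only the standard library's html.parser is used.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class CommentLinkFinder(HTMLParser):
    """Sort <a href> links by whether an ancestor looks like a comments container."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one bool per open element: inside a comment-like container?
        self.comment_links = []
        self.other_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        label = ((attrs.get("id") or "") + " " + (attrs.get("class") or "")).lower()
        # Assumption: "comment" in the id/class marks a comments container.
        inside = (bool(self.stack) and self.stack[-1]) or "comment" in label
        if tag == "a" and attrs.get("href"):
            target = self.comment_links if inside else self.other_links
            target.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append(inside)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def split_links(html):
    """Return (links inside comment-like containers, all other links)."""
    finder = CommentLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.comment_links, finder.other_links


# Invented sample page, for illustration only.
SAMPLE = """
<article><p>See <a href="https://example.com/in-body">this</a>.</p></article>
<div id="comments"><div class="comment-body">
<p><a href="https://example.com/in-comment">my site</a><br></p>
</div></div>
"""

in_comments, elsewhere = split_links(SAMPLE)
print("comment links:", in_comments)
print("other links:", elsewhere)
```

For Ajax-loaded comments the HTML passed to split_links would have to be the rendered page (for example Selenium's driver.page_source), and sites that name their comment containers differently would need a different rule.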
#7
I think you should show an example page of what you're scraping. HTML comments are ignored by search engines, and thus don't matter for SEO.
#8
Who said that comments are ignored by search engines?