Python Forum
Good book on Web scraping and crawling
#1
Hi All,

Could you suggest a good, standard, up-to-date book on Python web scraping and crawling?

Thanks,
Surya
#2
You don't really need a book. You can use free online sources.

We have a tutorial:
https://python-forum.io/Thread-Web-Scraping-part-1

I would say the standard practice nowadays is to use the requests module with BeautifulSoup, along with some basic knowledge of HTML so you know what you are parsing/crawling. Instead of BeautifulSoup you could use lxml. A little bit of JavaScript wouldn't hurt, and Selenium is useful for sites that build their content with JavaScript.
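As a rough illustration of the requests + BeautifulSoup approach, here is a minimal sketch. The function names fetch_html and extract_links and the sample markup are made up for this example, and it assumes the requests and beautifulsoup4 packages are installed. The parsing is demonstrated on a hard-coded snippet so it runs without touching the network.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text (raises on HTTP errors)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


# Offline demonstration on hypothetical markup (not a real page).
sample_html = """
<html><body>
  <a href="/Thread-Web-Scraping-part-1">Part 1</a>
  <a href="https://python-forum.io/Thread-Web-scraping-part-2">Part 2</a>
  <a>no href here</a>
</body></html>
"""
links = extract_links(sample_html, "https://python-forum.io/")
print(links)
```

Against a live site you would pass the result of fetch_html(url) to extract_links instead of the sample string. The timeout and raise_for_status() are there so a slow or failing server does not go unnoticed.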

Most of your googling will most likely be about how to grab tag X with BeautifulSoup.
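For reference, the usual ways of grabbing a tag look roughly like this. The HTML below is invented for the example, and it assumes beautifulsoup4 is installed. It is only a sketch of the common calls, not a complete guide.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for illustration.
html = """
<div id="content">
  <h1 class="title">Web scraping</h1>
  <p class="intro">First paragraph</p>
  <p>Second paragraph</p>
  <a href="/part-2" class="next">Next</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, find_all() returns every match.
first_p = soup.find("p")
all_p = soup.find_all("p")

# Filter by class; note the trailing underscore, since "class" is a Python keyword.
intro = soup.find("p", class_="intro")

# CSS selectors are available through select() and select_one().
next_link = soup.select_one("div#content a.next")

title_text = soup.find("h1").get_text(strip=True)
paragraph_texts = [p.get_text(strip=True) for p in all_p]
next_href = next_link["href"]  # attributes are read like dictionary keys

print(title_text, paragraph_texts, next_href)
```

Keep in mind that find() and select_one() return None when nothing matches, so check the result before calling get_text() on a real page.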

NOTE: I haven't read this book. I only skimmed through it quickly, but it covers BeautifulSoup, Scrapy, APIs, Selenium, XPath, image processing, text recognition, bot traps, etc. Its only drawback is that it uses urllib instead of the requests library. You can find the same information online if you search for it.
http://shop.oreilly.com/product/0636920034391.do

EDIT:
https://nocodewebscraping.com/top-10-web...ing-books/
#3
There is also a second part to snippsat's tutorial: https://python-forum.io/Thread-Web-scraping-part-2