Download article without photo caption

I am using newspaper3k to download newspaper articles to .txt files. Is there any way to download only the actual article text, i.e. without the photo captions or the links that forward the reader to other articles? For example, can I copy this article without including the text "Emirates and Airbus both said Thursday that the A380 remains highly popular with passengers.", which is the caption of the photo? Likewise, I would like to exclude text such as "related article: xxx" or "Did you read this xxx", which often appears in the middle of the article.

You can use the Beautiful Soup library, which is covered in the book 'Web Scraping with Python' (Ryan Mitchell).
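As a rough illustration of that idea, here is a minimal sketch that parses a page with Beautiful Soup, drops the elements that hold captions and "related article" teasers, and keeps only the paragraph text. The sample markup, the function name `extract_article_text`, and the selectors (`figure`, `figcaption`, `.related`) are assumptions made up for this example; real news sites use their own tags and class names, so you would need to inspect the page and adjust the selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; a real article page will look different.
SAMPLE_HTML = """
<article>
  <p>First paragraph of the article.</p>
  <figure>
    <img src="photo.jpg">
    <figcaption>Emirates and Airbus both said Thursday that the A380
    remains highly popular with passengers.</figcaption>
  </figure>
  <p>Second paragraph of the article.</p>
  <div class="related"><p>Related article: xxx</p></div>
  <p>Third paragraph of the article.</p>
</article>
"""


def extract_article_text(html, unwanted_selectors=("figure", "figcaption", ".related")):
    """Return the paragraph text of `html` without the unwanted elements.

    `unwanted_selectors` are CSS selectors chosen for the sample above;
    they are an assumption, not something every site uses.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Detach every element that matches one of the unwanted selectors.
    for selector in unwanted_selectors:
        for tag in soup.select(selector):
            tag.extract()

    # Collect the text of the paragraphs that are still in the tree.
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return "\n\n".join(text for text in paragraphs if text)


if __name__ == "__main__":
    print(extract_article_text(SAMPLE_HTML))
```

The returned string can then be written to a .txt file in the usual way. If you are already downloading pages with newspaper3k, the raw HTML it fetched could be passed to a function like this instead of the sample string.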
(Feb-14-2019, 12:37 PM)AlekseyPython Wrote: which is covered in the book 'Web Scraping with Python' (Ryan Mitchell).
We have an updated tutorial here, so there is no reason to buy that book from 2015 (it uses BeautifulSoup 3, whereas the current version is bs4, and it does not use Requests):
Web-Scraping part-1
Web-Scraping part-2