Python Forum
getting started, again
#16
(Jul-21-2018, 03:18 AM)bluedoor5 Wrote: Is this now ready to do a web scrape, or do I require anything else ?
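You can look at Web-Scraping part-1.
You will need Requests and BeautifulSoup (pip install beautifulsoup4, imported as bs4 in the code below), and optionally lxml (for Python 3.7 you will need a wheel from gohlke).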
pip install requests  

# E.g. the lxml wheel for 32-bit Python 3.7
pip install lxml-4.2.3-cp37-cp37m-win32.whl
Processing c:\aaa\lxml-4.2.3-cp37-cp37m-win32.whl
Installing collected packages: lxml
Successfully installed lxml-4.2.3
Then you can test the first code example: just copy it into PyScripter and press the Run button.
import requests
from bs4 import BeautifulSoup
 
url = 'http://CNN.com'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')
print(soup.find('title').text)
Output:
CNN International - Breaking News, US News, World News and Video
bluedoor5 Wrote: But there are variations, for example, "over-write" the existing text within, or in some cases append to the same txt file, so it fills up.
Where would I find examples, or are these already in those libraries?
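Reading and writing files is a standard part of Python, so no extra library is needed. The mode passed to open() decides what happens to existing text: 'w' overwrites it, 'a' appends to it. The example below opens the file in 'a+' mode, so each run appends a new line.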
import requests
from bs4 import BeautifulSoup

url = 'http://CNN.com'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'html.parser')
with open('title.txt', 'a+') as f_out:
    f_out.write(f"{soup.find('title').text}\n")
So if I run the code 3 times, title.txt will look like this.
Output:
CNN International - Breaking News, US News, World News and Video
CNN International - Breaking News, US News, World News and Video
CNN International - Breaking News, US News, World News and Video
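For the "over-write" variation you asked about, here is a minimal sketch that shows both modes side by side. The helper name save_title, the overwrite flag, and the sample titles are made up for this illustration (they are not from Requests or BeautifulSoup), and it writes to a temporary folder rather than fetching a real page, so it runs without a network connection.

```python
import tempfile
from pathlib import Path


def save_title(path, title, overwrite=False):
    """Write one title per line (hypothetical helper for illustration).

    overwrite=True opens the file in 'w' mode, which replaces existing text.
    overwrite=False opens it in 'a' mode, which adds to the end of the file.
    """
    mode = 'w' if overwrite else 'a'
    with open(path, mode, encoding='utf-8') as f_out:
        f_out.write(f"{title}\n")


# Demo in a temporary folder so no real file is touched.
demo_file = Path(tempfile.mkdtemp()) / 'title.txt'

# Two appends: the file fills up, one line per call.
save_title(demo_file, 'First title')
save_title(demo_file, 'Second title')
appended = demo_file.read_text(encoding='utf-8')

# One overwrite: the earlier lines are replaced.
save_title(demo_file, 'Third title', overwrite=True)
overwritten = demo_file.read_text(encoding='utf-8')

print(appended)
print(overwritten)
```

In the scraping code you would pass soup.find('title').text as the title. Plain 'a' is enough when you only write; 'a+' additionally allows reading from the same file object.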
