Python Forum
Basic Syntax/HTML Scrape Questions
#6
Wanted to say thank you to everyone. As of this morning I've got suitable code working.


Here it is....

from bs4 import BeautifulSoup

# Parse the saved page with the lxml parser
with open('apple.html') as html_file:
    soup = BeautifulSoup(html_file, 'lxml')

# Print the text of every <div class="player-name">
for players in soup.find_all('div', class_='player-name'):
    hope = players.text
    print(hope)

As I go through the learning process, I want to chronicle some of the issues I've had. (I do apologize; to give everyone an idea, I downloaded PyCharm on Monday and hadn't written or really seen a single line of Python, or really any code, until Tuesday.)
1. With this code I had a pretty bad hang-up trying to pip install things from inside Python rather than from the command line. I made a ton of progress once I could pip install bs4 and lxml.
2. Issue with the syntax for opening a file in Python. I'm still a little foggy on how to navigate the path to various files; currently I'm using Shift+Right Click to open PowerShell in the folder where 'apple.html' is located. I would love to be able to simply open a command prompt from anywhere and have the code refer to this specific file.
3. Issue with how to pull only the text from inside the div class 'player-name'. The .text attribute eluded me for what seemed like quite a while. Still one issue here; currently the list prints like so:
'Player 1 Name

Player 2 Name

Player 3 Name'
Would like to have it print like this:
'Player 1 Name
Player 2 Name
Player 3 Name'

4. I know this one is stupid, but I NEED to remember the ':' at the end of lines starting with 'with' or 'for' (or 'if' or 'elif'). I was stumped there for a little while.
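
On the blank lines in point 3: a likely cause (a guess, since apple.html itself isn't shown) is that each div's text carries its own newline and print() then adds another. A minimal sketch of one fix is to strip the whitespace with get_text(strip=True). SAMPLE_HTML and player_names are made-up names for illustration, and 'html.parser' is used here only so the sketch runs without lxml installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for apple.html; the real page's markup may differ.
SAMPLE_HTML = """
<div class="player-name">
Player 1 Name
</div>
<div class="player-name">
Player 2 Name
</div>
<div class="player-name">
Player 3 Name
</div>
"""

def player_names(html):
    """Return the text of each player-name div with surrounding whitespace removed."""
    # 'html.parser' ships with Python; 'lxml' as used in the post should behave the same here.
    soup = BeautifulSoup(html, 'html.parser')
    return [div.get_text(strip=True)
            for div in soup.find_all('div', class_='player-name')]

names = player_names(SAMPLE_HTML)
print('\n'.join(names))
```

In the original loop the equivalent change would be print(players.text.strip()).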

Don't want this to be a novel, but my next goal for this code is to automate further, so that I don't have to save the HTML into a file by manually right-clicking in the browser and using Inspect.
So I'm looking to add lines before what I currently have that will lock onto the active browser, open the browser's Inspect tool, save the HTML to a default file, and then execute the current code.
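
One possible way to skip the manual save, sketched here under assumptions: rather than driving the browser, download the page directly with the standard library and write it to the file the existing code already reads. save_page is a made-up helper name and the URL in the usage comment is a placeholder. If the site builds the player list with JavaScript, the downloaded HTML may not match what Inspect shows, and this approach would not be enough on its own.

```python
from pathlib import Path
from urllib.request import urlopen

def save_page(url, out_path='apple.html'):
    """Download `url`, write the HTML to `out_path`, and return it as text."""
    with urlopen(url) as response:
        # Assumes UTF-8; bytes that cannot be decoded are replaced instead of raising.
        html = response.read().decode('utf-8', errors='replace')
    Path(out_path).write_text(html, encoding='utf-8')
    return html

# Hypothetical usage (placeholder URL, not the real site):
# save_page('https://example.com/players')
```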
Later I'd like to add lines below the current code to place the player list into a specified column in an OpenOffice sheet.
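
For the spreadsheet step, one simple route (an assumption, not the only option) is to write the names to a CSV file, which OpenOffice Calc can open as a sheet with the names in one column. write_column, 'players.csv', and the 'Player' header are illustrative names.

```python
import csv

def write_column(names, out_path='players.csv', header='Player'):
    """Write a header and then one name per row, so all values land in one column."""
    # newline='' lets the csv module control line endings (avoids blank rows on Windows).
    with open(out_path, 'w', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow([header])
        for name in names:
            writer.writerow([name])

# Hypothetical usage with the list collected by the scraping loop:
# write_column(['Player 1 Name', 'Player 2 Name', 'Player 3 Name'])
```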

Today I'm doing more tutorials and will update on progress.

***In reference to point 2, I am currently getting this error message when I run my code in a PyCharm window:
FileNotFoundError: [Errno 2] No such file or directory: 'apple.html'
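
A probable explanation, hedged because the PyCharm run settings aren't shown: a bare name like 'apple.html' is looked up in the current working directory, which is the folder PowerShell was opened in but may be a different folder when PyCharm runs the script. A minimal sketch of one fix is to build the path from the script's own location. path_beside is a made-up helper name, and the sketch assumes apple.html sits in the same folder as the script.

```python
from pathlib import Path

def path_beside(script_file, filename):
    """Return the absolute path of `filename` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / filename

# In the real script (assuming apple.html is saved next to the .py file):
#   html_path = path_beside(__file__, 'apple.html')
#   with open(html_path) as html_file:
#       ...
```

With this, the script finds the file no matter which folder the command prompt or PyCharm starts from.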


Messages In This Thread
Basic Syntax/HTML Scrape Questions - by sungar78 - Sep-05-2018, 11:58 PM
RE: Basic Syntax/HTML Scrape Questions - by sungar78 - Sep-06-2018, 09:32 PM
