Basic Syntax/HTML Scrape Questions

sungar78 · (This post was last modified: Sep-06-2018, 09:46 PM by sungar78.)

Wanted to say thank you to everyone. As of this morning I've got suitable code working.

Here it is....

from bs4 import BeautifulSoup

with open('apple.html') as html_file:
    soup = BeautifulSoup(html_file, 'lxml')

for players in soup.find_all('div', class_='player-name'):
    hope = players.text
    print(hope)

As I go through the learning process I want to chronicle some of the issues I've had. (I do apologize; to give everyone an idea, I downloaded PyCharm on Monday and hadn't written or really seen a single line of Python, or really any code, until Tuesday.)
1. With this code I had a pretty bad hang-up trying to use Python itself rather than the command line to pip install things. Made a ton of progress once I could pip install bs4 and lxml.
2. Issue with the syntax for opening a file in Python. (I'm still a little foggy on how to navigate the path to various files; currently I'm using Shift+Right Click to open PowerShell in the folder where 'apple.html' is located. Would love to be able to simply open a command prompt from anywhere and have it specifically refer to this file.)
3. Issue with how to pull only the text from inside the div class 'player-name'. The .text portion eluded me for what seemed like quite a while. Still one issue here: currently the list prints like so:
'Player 1 Name

Player 2 Name

Player 3 Name'
Would like to have it print like this:
'Player 1 Name
Player 2 Name
Player 3 Name'

4. I know this one is stupid, but I NEED to remember the ':' at the end of the lines starting with 'with' or 'for' (or 'if' or 'elif'). Was stumped there for a little while.
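
On the blank lines from point 3: one likely cause (an assumption, since apple.html isn't shown) is that each div's text carries a newline before and after the name, and print() then adds its own. A minimal sketch of stripping that whitespace, using a made-up sample list and a hypothetical helper named clean_names in place of the real page:

```python
def clean_names(raw_texts):
    """Strip surrounding whitespace from each string and drop empty results."""
    cleaned = []
    for raw in raw_texts:
        name = raw.strip()  # removes leading/trailing spaces and newlines
        if name:
            cleaned.append(name)
    return cleaned


# Hypothetical sample of what players.text might hold for each div,
# assuming the name sits on its own line inside the tag.
sample = ["\nPlayer 1 Name\n", "\nPlayer 2 Name\n", "\nPlayer 3 Name\n"]

for name in clean_names(sample):
    print(name)
```

In the loop from the post, the same idea would be print(players.text.strip()). Beautiful Soup's get_text(strip=True) is another option that should give the same result here.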

Don't want this to be a novel. But my next goals for this code are to further automate, so that I don't have to save the html into a file by manually right clicking the browser and using Inspect to save the html.
So looking to add lines before what I currently have that will lock onto the active browser, open google inspect, save the html to a default file, and then execute current code....
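
One possible way to skip the manual save, sketched with the standard library only. This swaps in a plain download for the "lock onto the browser" idea, and it assumes the player names are present in the raw HTML the server sends. If the page builds the list with JavaScript, the downloaded markup may not match what Inspect shows. The function name fetch_html, its opener parameter, and the URL in the usage comment are placeholders, not anything from the original code.

```python
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Download url and return the response body as text."""
    with opener(url) as response:
        raw = response.read()
    # Assumes UTF-8; undecodable bytes are replaced instead of raising.
    return raw.decode("utf-8", errors="replace")


# Usage (placeholder URL, not the real page):
# html = fetch_html("https://example.com/players")
# soup = BeautifulSoup(html, 'lxml')
```

BeautifulSoup accepts a string as well as an open file, so the returned text can go straight into the existing parsing code.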
Later I'd like to add lines below current code to place the player list into a specified Column in an Open Office Sheet.
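
For the spreadsheet goal, a simple stand-in is to write the names to a CSV file, which Open Office Calc can open as a single column. This is not the same as writing into a sheet that is already open, which is a more involved task not shown here. The function name write_player_column, the header text, and the file name in the usage comment are assumptions for illustration.

```python
import csv


def write_player_column(names, out_file, header="Player"):
    """Write names as a single column, one row per name, under a header row."""
    writer = csv.writer(out_file)
    writer.writerow([header])
    for name in names:
        writer.writerow([name])


# Usage: newline='' is what the csv module documentation recommends
# for files that csv.writer writes to.
# with open('players.csv', 'w', newline='', encoding='utf-8') as out_file:
#     write_player_column(['Player 1 Name', 'Player 2 Name'], out_file)
```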

Today doing more tutorials, will update on progress.

***In reference to point 2 I am currently getting this error message when I run my code in a PyCharm window
FileNotFoundError: [Errno 2] No such file or directory: 'apple.html'
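
A relative name like 'apple.html' is looked up in the current working directory, and when the script runs inside PyCharm that directory may not be the folder PowerShell was opened in. A minimal sketch of building the full path explicitly, so the code works no matter where it is launched from. The helper name find_html and the choice of base folder are assumptions.

```python
from pathlib import Path


def find_html(filename, base_dir):
    """Return the absolute path to filename inside base_dir, or raise with a clear message."""
    path = Path(base_dir).resolve() / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found (current working directory is {Path.cwd()})"
        )
    return path


# Usage, assuming apple.html sits in the same folder as the script:
# html_path = find_html('apple.html', Path(__file__).parent)
# with open(html_path) as html_file:
#     soup = BeautifulSoup(html_file, 'lxml')
```

Printing Path.cwd() at the top of the script is also a quick way to see which folder PyCharm is actually running from.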
