Python Forum
Beautiful soup and tags
#11
(Jul-08-2019, 12:16 PM)starter_student Wrote: and now there is no error but the output file is empty just with headers
That's because your parsing (or something else) is wrong.
Test in small steps: put in print() calls and try things out in the REPL.
store_details = {} should be outside of the loop, as in the sketch below.
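For example, a minimal sketch of why the placement matters (the list here is just a stand-in for table.find_all('li')):

rows = ['Coffee', 'Tea', 'Milk']  # stand-in for table.find_all('li')

store_details = {}  # created once, before the loop
for row in rows:
    store_details[row] = f'<{row}> parsed for site'
print(store_details)  # all three entries are kept

# If store_details = {} sat inside the loop, each pass would reset the
# dict and only the last entry would reach the csv writer.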

The HTML code you posted is just a mess.
Here's how you can test HTML code outside of a website:
from bs4 import BeautifulSoup
import csv
import requests

html = '''\
<div id="storelist">
  <ul>
    <li>Coffee</li>
    <li>Tea</li>
    <li>Milk</li>
  </ul>
</div>'''

#URL = "http://www.abc.com"
#r = requests.get(URL)
soup = BeautifulSoup(html, 'lxml')
table = soup.find('div', id="storelist")
print(table) # Test print
store_details = {}
for row in table.find_all('li'):
    store_details[row.text] = f'<{row.text}> parsed for site'

filename = 'store_details_tab.csv'
with open(filename, 'w', newline='') as f:  # newline='' avoids blank rows on Windows
    w = csv.DictWriter(f, ['Coffee', 'Tea', 'Milk'])
    w.writeheader()
    w.writerow(store_details)
In the csv file:
Output:
Coffee,Tea,Milk
<Coffee> parsed for site,<Tea> parsed for site,<Milk> parsed for site
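csv.DictWriter maps each dict key to the matching name in the fieldnames list, so the keys ('Coffee', 'Tea', 'Milk') line up with the header row and the values fill the single data row.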
#12
(Jul-08-2019, 02:15 PM)snippsat Wrote: ...

Thanks for this approach ... it helped me to understand some things. The HTML code I posted was just a sample ... here is the actual structure, with a nested div:

[html]
<div id="storelist" class>
  <ul>
    <li id="00021455" class>
      <div class="wr-store-details">
        <p> name </p>
        <span class="address dc">Street 2</span>
        <span class="city">LA</span>
      </div>
    </li>
    <li>
    </li>
    ...
  </ul>
</div>
[/html]
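A minimal sketch of how the same approach could handle this nested structure; it assumes the class names shown above, skips the empty <li> entries, and the CSV column names (id, name, address, city) are my own illustrative choice:

from bs4 import BeautifulSoup
import csv

# Stand-in for the real page, following the structure posted above.
html = '''\
<div id="storelist">
  <ul>
    <li id="00021455">
      <div class="wr-store-details">
        <p>name</p>
        <span class="address dc">Street 2</span>
        <span class="city">LA</span>
      </div>
    </li>
    <li>
    </li>
  </ul>
</div>'''

soup = BeautifulSoup(html, 'lxml')
stores = []
for li in soup.find('div', id='storelist').find_all('li'):
    details = li.find('div', class_='wr-store-details')
    if details is None:  # the empty <li></li> entries have no details div
        continue
    stores.append({
        'id': li.get('id'),
        'name': details.p.get_text(strip=True),
        # class_='address' also matches the multi-valued class "address dc"
        'address': details.find('span', class_='address').get_text(strip=True),
        'city': details.find('span', class_='city').get_text(strip=True),
    })

with open('store_details_tab.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, ['id', 'name', 'address', 'city'])
    w.writeheader()
    w.writerows(stores)

For the real page you would fetch it with requests.get(URL) and pass r.text to BeautifulSoup instead of the html string.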

