Extracting the Address tag from multiple HTML files using BeautifulSoup

**buran** · (This post was last modified: Jan-24-2021, 10:17 AM by buran.)

I didn't look at your code in depth. Now I see it's a bit weird. You iterate over a bunch of files to read the title into a separate list first, then iterate over them again to read the address(es).

There is no need to use find_all for the title - a page is expected to have only one title tag, right? Just use soup.find().
Then, is there one address or multiple in each file?
You write to the file after you have exited the second loop. But a list comprehension assigned inside the loop gives you only the data from the last file, not from all files (unlike appending to a single list) - this is something I overlooked.
Finally, you write 2 lists, but I don't think that will give you what you expect anyway.
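
To illustrate the point about the list comprehension, here is a minimal sketch. The file names and addresses are made up (I don't have your data), and the dict just stands in for whatever you parse out of each file.

```python
# Hypothetical stand-in for the addresses parsed from three HTML files.
files = {
    "a.html": ["1 Main St"],
    "b.html": ["2 Oak Ave", "3 Pine Rd"],
    "c.html": ["4 Elm Ct"],
}

# Rebinding the name on every pass: only the last file's data survives.
for name, found in files.items():
    addresses_last_only = [addr for addr in found]

# Extending one list created before the loop: data from every file is kept.
addresses_all = []
for name, found in files.items():
    addresses_all.extend(found)

print(addresses_last_only)
print(addresses_all)
```

The first loop throws away the previous list each time it runs, so whatever you write after the loop comes from the last file only.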

import csv
import glob
import os

from bs4 import BeautifulSoup

path = "C:\\Users\\mzoljan\\Downloads\\lksd\\"

# open the output file once, in append mode; newline='' avoids blank rows on Windows
with open('output2.csv', 'a', newline='') as myfile:
    writer = csv.writer(myfile)
    for infile in glob.glob(os.path.join(path, "*.html")):
        with open(infile, "r") as f:
            soup = BeautifulSoup(f.read(), 'lxml')
        title = soup.find("title")
        if title:
            title = title.string
        else:
            title = ''  # just in case there is no title tag
        # do you really need find_all?
        address = soup.find_all("address", class_="styles_address__zrPvy")
        for item in address:
            # get_text() also works when the address tag has nested tags
            writer.writerow([title, item.get_text(strip=True)])

Note, the code is not tested as I don't have your html files.
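
If you want to check the extraction logic without the real files, something like the sketch below could work. The sample HTML is invented, extract_rows is just a helper name I made up, and I use the built-in 'html.parser' here so it runs without lxml. It uses get_text() rather than .string, because .string is None when a tag has more than one child (e.g. an address split with a br tag).

```python
from bs4 import BeautifulSoup


def extract_rows(html):
    """Return a list of [title, address] rows for one HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the document has no title tag
    title = soup.title.get_text(strip=True) if soup.title else ""
    addresses = soup.find_all("address", class_="styles_address__zrPvy")
    # join the text pieces of each address tag with a single space
    return [[title, tag.get_text(" ", strip=True)] for tag in addresses]


# Invented sample page: one matching address, one with a different class.
sample = """<html><head><title>Shop A</title></head><body>
<address class="styles_address__zrPvy">1 Main St<br>Springfield</address>
<address class="other">ignored</address>
</body></html>"""

rows = extract_rows(sample)
print(rows)
```

Each row can be passed straight to writer.writerow(), so the same helper could be called once per file inside the glob loop.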
