Hello S,
So all my HTML files have to be UTF-8.
The files are just saved from a Word document as Web Page HTML.
I will do some testing.
Thanks for the help.
Hello S,
I have been testing with the HTML files you gave me.
There is no error with them.
So I have to encode all my HTML files as UTF-8.
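A minimal sketch of re-encoding one file as UTF-8, using only the standard library. The helper name `reencode_to_utf8` is hypothetical, and the default source encoding `cp1252` is an assumption about what Word wrote; the real encoding may differ and should be checked first.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite a text file as UTF-8, decoding it with source_encoding first.

    source_encoding is an assumption (cp1252 is a guess for Word-saved HTML).
    """
    path = Path(path)
    # Read raw bytes so no implicit decoding or newline translation happens.
    text = path.read_bytes().decode(source_encoding)
    # Write bytes back so existing line endings are preserved as they were.
    path.write_bytes(text.encode("utf-8"))
    return text
```

If the file contains a meta tag that still names the old charset, that tag would need updating as well. An alternative that leaves the files untouched is to pass the encoding when opening them, e.g. `open(name, encoding="cp1252")`.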
The final part, saving my images, gives me an error:
import os, os.path
from PIL import Image
from bs4 import BeautifulSoup as bs

path = 'c:/Users/Dan/Desktop/a/'
for root, dirs, files in os.walk(path):
    for f in files:
        soup = bs(open(os.path.join(root, f)), 'lxml')
        for image in soup.find_all("img"):
            image = image.get('src')
            im = Image.open(os.path.join(root, image["src"]))
            im.save(path+image["src"], "png")
            print(image)

im = Image.open(os.path.join(root, image["src"]))
TypeError: string indices must be integers
My file paths are OK and the code looks OK, but I don't know what causes this error.
This is the HTML:
<!DOCTYPE html>
<html>
<body>
<h2>HTML Image</h2>
<img src="images/image002.jpg" alt="Flowers in Chania" width="460" height="345">
</body>
</html>
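A possible cause, sketched without BeautifulSoup or PIL so it runs on its own: after `image = image.get('src')`, the name `image` holds a plain string rather than a tag, so `image["src"]` indexes a string with a string. The helper `resolve_image_path` is a hypothetical name used only for illustration.

```python
import os


def resolve_image_path(root, src):
    """Join the folder of the HTML file with the img tag's src value."""
    return os.path.normpath(os.path.join(root, src))


# This is the kind of value `image` holds after `image = image.get('src')`.
src = "images/image002.jpg"

try:
    src["src"]  # same operation as image["src"] in the loop
except TypeError as exc:
    error_message = str(exc)

# Using the string directly, with no second ["src"] lookup.
full_path = resolve_image_path("c:/Users/Dan/Desktop/a", src)
print(error_message)
print(full_path)
```

In the loop, that would mean keeping the string in its own variable (for example `src = image.get('src')`) and calling `Image.open(os.path.join(root, src))`, skipping any tag where `src` is `None`.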
:)
Python newbie trying to learn the ropes