Sep-06-2022, 06:54 PM
Thanks.
Turns out Python's open() writes with the system's default encoding (Latin-1 here) unless told to use another one. The file now displays OK in an editor.
For some reason, chardet doesn't detect it as UTF8, though:
C:\Python38-32\Scripts\chardetect.exe output.kml: ISO-8859-1 with confidence 0.683404255319149
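chardet only makes a statistical guess, so a stricter check is to try decoding the raw bytes as UTF-8 and see whether that fails. Here is a minimal sketch of that idea; the helper names is_valid_utf8 and file_is_valid_utf8 are made up for illustration and are not part of chardet.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict by default: raises on invalid sequences
        return True
    except UnicodeDecodeError:
        return False


def file_is_valid_utf8(path) -> bool:
    """Same check, applied to a file's raw bytes (hypothetical helper)."""
    return is_valid_utf8(Path(path).read_bytes())


# The same text encoded two ways: only the first is valid UTF-8
print(is_valid_utf8("Caf\u00e9".encode("utf-8")))    # True
print(is_valid_utf8("Caf\u00e9".encode("latin-1")))  # False
```

Running file_is_valid_utf8("output.kml") should tell you whether the file really is UTF-8, whatever chardet reports. Note that a pure-ASCII file also passes, since ASCII is a subset of UTF-8.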
from bs4 import BeautifulSoup
import pathlib
import os
…
PATH = pathlib.Path(item).parent
EXTENSION = pathlib.Path(item).suffix
BASENAME = pathlib.Path(item).stem
#Type is <class 'str'>
print("Type is ", type(BASENAME))
OUTPUTFILE = f"{BASENAME}.EDITED{EXTENSION}"
os.chdir(PATH)

soup = BeautifulSoup(open(item, 'r'), 'xml')
name = soup.select_one("kml > Document > name")
if name:
    print("Name found")
    name.string = BASENAME
else:
    print("No name")
    name = soup.new_tag("name")
    name.string = BASENAME
    #get parent, and insert
    doc = soup.select_one("kml > Document")
    doc.insert(0, name)

#IMPORTANT!
with open(OUTPUTFILE, "w", encoding='utf-8') as file:
    file.write(soup.prettify(formatter=None))
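To see what the encoding='utf-8' argument in the last step changes, here is a small sketch that writes the same string with two encodings and reads the raw bytes back. The helper write_and_read_back and the file name sample.txt are made up for illustration. As far as I know, on Python 3.8 a text-mode open() without encoding= falls back to locale.getpreferredencoding(False), which is what the first print shows.

```python
import locale
import os
import tempfile


def write_and_read_back(text, encoding):
    """Write text with an explicit encoding and return the raw bytes on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")  # throwaway file name
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()


# Encoding that open() uses when encoding= is omitted (platform dependent)
print("default:", locale.getpreferredencoding(False))

# The accented character is one byte in Latin-1 but two bytes in UTF-8
print(write_and_read_back("Caf\u00e9", "latin-1"))  # b'Caf\xe9'
print(write_and_read_back("Caf\u00e9", "utf-8"))    # b'Caf\xc3\xa9'
```

The same applies when reading: open(item, 'r') in the script above also decodes with the default encoding, so passing encoding= there as well may be worth trying if the input file is UTF-8.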