reading html and edit chekcbox to html - Printable Version
Python Forum (https://python-forum.io)
Thread: reading html and edit chekcbox to html (/thread-34142.html)
reading html and edit chekcbox to html - jacklee26 - Jun-30-2021

I have a question about reading an HTML file and editing it to add a checkbox before the <a> tag of every link. I have a test.html that looks like this:

Quote:
<a href="https://www.google.com">Link </a><br><a href="https://www.youtube.com">Link </a><br><a href="https://www.instagram.com">Link </a><br>

I want the output to have a checkbox before each link, like this:

    <input type="checkbox"> <a href="https://www.google.com">Link </a><br>
    <input type="checkbox"> <a href="https://www.youtube.com">Link </a>
    <input type="checkbox"> <a href="https://www.instagram.com">Link </a>

Does anyone have any idea? I tried this, but it does not seem to work:

    lines = []
    # open the file
    with open(r'test.html', mode='r') as f:
        for line in f.readlines():  # iterate through the lines
            if '<br>' in line:
                text = '<input type="checkbox">'
                lines.append(text)
            lines.append(line)
    # write the lines back out
    with open('test.html', mode='w') as new_f:
        new_f.writelines(lines)

I think I need to add a newline after each <br>, otherwise it won't work. Does anyone know how to solve it?

Steps:
1. Read test.html.
2. Edit test.html, adding a checkbox before each <a> tag.

RE: reading html and edit chekcbox to html - snippsat - Jun-30-2021

I would do it like this. I also remove the <br> tag so the output is clean; it's better to write CSS for the line breaks. This could also be done with a parser, e.g. BeautifulSoup, but here I'm adding the checkbox the simple way, without one.

    with open(r'test.html') as f, open('out.html', 'w') as f_out:
        for line in f:
            line = line.replace('<br>', '')
            #print(f'<input type="checkbox"> {line}')
            f_out.write(f'<input type="checkbox"> {line}')

Example with CSS: CodePen

Maybe also add a <ul> tag for better CSS.
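
Note: the loop above adds one checkbox per line that the file object yields, so it relies on each link sitting on its own line. Here is a minimal sketch for checking that assumption without touching any files. The helper name `add_checkbox_per_line` and the in-memory `io.StringIO` inputs are made up for illustration; they are not part of the original posts.

```python
import io


def add_checkbox_per_line(html_file):
    """Prefix every line yielded by the file object with a checkbox."""
    out = []
    for line in html_file:
        line = line.replace('<br>', '')
        out.append(f'<input type="checkbox"> {line}')
    return ''.join(out)


links = [
    '<a href="https://www.google.com">Link </a><br>',
    '<a href="https://www.youtube.com">Link </a><br>',
    '<a href="https://www.instagram.com">Link </a><br>',
]

# One link per line: the loop body runs three times.
multi_line = add_checkbox_per_line(io.StringIO('\n'.join(links) + '\n'))
# All links on a single line: the loop body runs only once.
single_line = add_checkbox_per_line(io.StringIO(''.join(links)))

print(multi_line.count('<input type="checkbox">'))   # 3
print(single_line.count('<input type="checkbox">'))  # 1
```

If the second count is lower than the number of links, the input has no newlines between the links and a line-by-line loop cannot separate them.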
RE: reading html and edit chekcbox to html - jacklee26 - Jul-01-2021

(Jun-30-2021, 04:51 PM)snippsat Wrote: I would do like this also remove

When I run it, I get this error:

    UnicodeDecodeError: 'cp950' codec can't decode byte 0xbf in position 2: illegal multibyte sequence

So I tried adding the encoding:

    with open(r'test.html', encoding="utf-8") as f, open('out123.html', 'w', encoding="utf-8") as f_out:
        for line in f:
            print(line)
            line = line.replace('<br>', '')
            print(f'<input type="checkbox"> {line}')
            f_out.write(f'<input type="checkbox"> {line}')

But the output looks like this. It still does not add a checkbox in front of every link, only the first one:

    <input type="checkbox"> <a href="https://www.google.com">Link </a><a href="https://www.youtube.com">Link </a><a href="https://www.instagram.com">Link </a>

Thanks

RE: reading html and edit chekcbox to html - snippsat - Jul-01-2021

(Jul-01-2021, 04:11 AM)jacklee26 Wrote: t still not add checkbox in front of the link, only the first one.

OK, I thought your input had each link on a new line; I see now that the input is one big line. If you create this file yourself, you could probably fix that when saving it, and also save it as utf-8. Then it will be like this:

    with open(r'test.html', encoding='utf-8') as f, open('out.html', 'w', encoding='utf-8') as f_out:
        content = f.read().split('<br>')
        for line in content[:-1]:
            #print(f'<input type="checkbox"> {line}')
            f_out.write(f'<input type="checkbox"> {line}\n')

With the <ul> tag, as I talked about:

    with open(r'test.html', encoding='utf-8') as f, open('out.html', 'w', encoding='utf-8') as f_out:
        content = f.read().split('<br>')
        f_out.write('<ul id="links">\n')
        for line in content[:-1]:
            f_out.write(f'<input type="checkbox"> {line}\n')
        f_out.write('</ul>\n')

CSS: CodePen

RE: reading html and edit chekcbox to html - jacklee26 - Jul-01-2021

(Jul-01-2021, 08:43 AM)snippsat Wrote: (Jul-01-2021, 04:11 AM)jacklee26 Wrote: t still not add checkbox in front of the link, only the first one.
Ok i thought that your input was on new line,see now that input is one big line.

Thanks, it works. But I have a question: why doesn't my code below work? Only the first line has a checkbox, the rest don't. I just don't know why.

    with open('test.html', 'r', encoding="utf-8") as file:
        # read a list of lines into data
        data = file.readlines()
    for i in data:
        line = i.replace('<br>', '<br> \n')
    with open('test_out.html', 'w', encoding="utf-8") as file:
        file.write(f'<input type="checkbox"> {line}')
        #file.write(line)

Output:

    <input type="checkbox"> <a href="https://www.google.com">Link </a><br>
    <a href="https://www.youtube.com">Link </a><br>
    <a href="https://www.instagram.com">Link </a><br>

RE: reading html and edit chekcbox to html - snippsat - Jul-01-2021

(Jul-01-2021, 09:50 AM)jacklee26 Wrote: but i have a question why my code does work for this one, one the first line have check box, the rest don't have. Just doesn't know why

Because it is still one line when you do it like this: the replace only puts newline characters inside a single string, and that string is written once with a single checkbox in front. You have to split on the newline and then use a new loop. .readlines() is pretty much never needed; as I did in my previous posts, loop directly over the file object.

    with open('test.html', 'r', encoding="utf-8") as file, open('test_out.html', 'w', encoding="utf-8") as f_out:
        for line in file:
            line = line.replace('<br>', '<br>\n').split('\n')
            for item in line[:-1]:
                f_out.write(f'<input type="checkbox"> {item}\n')
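
Note: the split-then-loop idea from the replies can also be tried on plain strings, which makes it easy to see that the result no longer depends on where the newlines fall. This is only a sketch under stated assumptions: `add_checkboxes`, `one_line` and `multi_line` are hypothetical names, and it assumes every link in the input is terminated by a literal `<br>`, as in the test.html from the first post.

```python
def add_checkboxes(html: str) -> str:
    """Put a checkbox in front of every <br>-terminated chunk of html."""
    # Split on <br> rather than on newlines, so one-line input works too.
    chunks = [chunk.strip() for chunk in html.split('<br>')]
    # Skip empty chunks, such as the one left after the final <br>.
    return ''.join(
        f'<input type="checkbox"> {chunk}<br>\n' for chunk in chunks if chunk
    )


# Same three links, once on a single line and once with a newline after each <br>.
one_line = (
    '<a href="https://www.google.com">Link </a><br>'
    '<a href="https://www.youtube.com">Link </a><br>'
    '<a href="https://www.instagram.com">Link </a><br>'
)
multi_line = one_line.replace('<br>', '<br>\n')

print(add_checkboxes(one_line))
```

Both `one_line` and `multi_line` give the same result here, one checkbox per link. For HTML that is less regular than this (different tag spelling such as `<br/>`, links not separated by `<br>`), a real parser is likely the safer choice.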