Dec-30-2021, 04:29 AM
Riding my bicycle to work this beautiful but cold morning, I thought of another way to do this.
Just out of interest: no modules are needed if you already have the HTML.
def myApp():
    # get the html somehow first, then open it
    path2text = '/home/pedro/temp/lovely_louise.html'
    with open(path2text) as f:
        lines = f.readlines()
    print('lines is', len(lines), 'long')
    # get lines with '<img src="' because they contain pictures
    # put these lines in data
    data = []
    for line in lines:
        if '<img src="' in line:
            data.append(line)
    print('data is', len(data), 'long')
    # have a look at the data
    for d in data:
        print(d)
    jpg_names = []
    # split each line on img src=
    # you get the list splitline
    # the second element of the list splitline, splitline[1], contains the name of the picture file
    for line in data:
        print(line)
        splitline = line.split('img src=')
        pic_data = splitline[1]
        pic_datalist = pic_data.split()
        name = pic_datalist[0]
        # maybe the picture file name is enclosed in ' ' otherwise by " ", get rid of them
        # maybe there is some leading or trailing space in the html
        # before or after the file name
        filename = name.replace('"', '').replace('\'', '').replace(' ', '')
        jpg_names.append(filename)
    print('jpg_names is', len(jpg_names), 'long')
    for j in jpg_names:
        print(j)
    savename = '/home/pedro/temp/photo_names.txt'
    with open(savename, 'w') as f:
        text = '\n'.join(jpg_names)
        f.write(text)
    print('All done!')
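One thing to watch with splitting on whitespace: a tag written as <img src="a.jpg"> with nothing after the closing quote leaves a trailing > on the name, and a line holding two pictures only yields the first one. Below is a rough sketch of one way around that, still with no modules. The function name extract_img_sources and the sample string are made up for illustration, and like the code above it assumes src is the first attribute right after <img.

```python
def extract_img_sources(html_text):
    """Collect the src value of every '<img src=...' occurrence in html_text."""
    marker = '<img src='
    sources = []
    pos = 0
    while True:
        start = html_text.find(marker, pos)
        if start == -1:
            break  # no more image tags
        value_start = start + len(marker)
        if value_start >= len(html_text):
            break  # the text ends right after 'src='
        quote = html_text[value_start]
        if quote in ('"', "'"):
            # quoted value: read up to the matching closing quote
            end = html_text.find(quote, value_start + 1)
            if end == -1:
                break  # unterminated quote, give up
            sources.append(html_text[value_start + 1:end].strip())
            pos = end + 1
        else:
            # unquoted value: read up to the next whitespace or '>'
            end = value_start
            while (end < len(html_text)
                   and not html_text[end].isspace()
                   and html_text[end] != '>'):
                end += 1
            sources.append(html_text[value_start:end])
            pos = end
    return sources


# hypothetical sample, two pictures on one line
sample = '<p><img src="a.jpg"><img src=\'b c.png\' alt="x"></p>'
for name in extract_img_sources(sample):
    print(name)
```

Reading the whole file with f.read() and passing the string to this function would replace the line-by-line loop. For messy real-world pages, the standard library's html.parser is probably the safer route.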