Apr-25-2017, 12:57 AM
I rarely have to worry about Unicode, especially at the code point (character) level.
I'm finding out that I really don't know how to handle it, and I can't find much help
by searching Google (perhaps because I don't know how to formulate my question).
I need to replace certain UTF-8 code points in my file because Microsoft does not include them
in their UTF-8 definition. The self.ms_no_points dictionary causes the error:
# from Kebap: May I suggest A I I D Y instead of Á Í Ï Ð Ý
class Utf8stuff:
    def __init__(self, infile_name=None, outfile_name=None):
        self.infile_name = infile_name
        self.outfile_name = outfile_name
        self.ms_no_points = {'\u+081': 'A', '\u+08d': 'I', '\u+08f': 'I', '\u+090':'D', '\u+09d': 'Y'}
        with open(self.infile_name) as f:
            self.inbuff = f.readlines()
        self.process_input()

    def process_input(self):
        linecount = 1
        for line in self.inbuff:
            for key, value in self.ms_no_points.items():
                if key in line:
                    pos = line.index(key)
                    print('found {} at pos: {} in line {}'.format(key, pos, linecount))
            linecount += 1

if __name__ == '__main__':
    ifile = 'er.sql'
    ofile = 'erNew.sql'
    Utf8stuff(infile_name=ifile, outfile_name=ofile)

traceback:
Error: File " .../myconv.py", line 9
self.ms_no_points = {'\u+081': 'A', '\u+08d': 'I', '\u+08f': 'I', '\u+090':'D', '\u+09d': 'Y'}
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
And so how's it done?
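
For what it's worth, the SyntaxError comes from the escape syntax itself: in a Python string literal, \u must be followed by exactly four hex digits, with no + sign, so '\u+081' cannot be parsed. Below is a minimal sketch of one way the replacement might be done, assuming the intended keys really are the code points U+0081, U+008D, U+008F, U+0090 and U+009D, and that the file is read as UTF-8. The names MS_NO_POINTS, replace_points and convert_file are made up for illustration and are not part of the code above.

```python
# Assumption: the keys are meant to be the code points U+0081, U+008D,
# U+008F, U+0090 and U+009D. A \u escape takes exactly four hex digits
# and no '+' sign ('\x81' would be an equivalent spelling of '\u0081').
MS_NO_POINTS = {
    '\u0081': 'A',
    '\u008d': 'I',
    '\u008f': 'I',
    '\u0090': 'D',
    '\u009d': 'Y',
}

# str.maketrans accepts a dict whose keys are single characters.
TABLE = str.maketrans(MS_NO_POINTS)


def replace_points(text):
    """Return text with each mapped code point replaced by its stand-in."""
    return text.translate(TABLE)


def convert_file(infile_name, outfile_name):
    """Hypothetical helper: copy a UTF-8 file with the replacements applied."""
    with open(infile_name, encoding='utf-8') as src, \
         open(outfile_name, 'w', encoding='utf-8') as dst:
        for line in src:
            dst.write(replace_points(line))
```

Using str.translate handles every occurrence on a line in one pass, whereas line.index(key) only reports the first one. Whether these five code points are the right ones to replace depends on how the file was actually encoded, which this sketch does not try to detect.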