Thread: how to detect \x in string so it can be removed (Python Forum, General Coding Help)
how to detect \x in string so it can be removed - azimmermann - Jun-12-2017

I have some garbled data that has random \x values in the strings. These should be skipped for the import to work correctly. I am trying to write code to skip them, but can't figure it out. Sample code is below.

I can get the code to find 764 in the second element without issue (it prints true), but I don't get a true out of the second if statement, when I would like to. How do I deal with the special characters in a case like this?

    somedata = ['5', '#5\x029.764', '3.768', '3.757', '3.776', '3.787', '3.778',
                '3.788', '3.760', '3.777', '3.791', '3.792', '3.791', '3.796',
                '3.798', '3.787', '3.785', '3.802', '3.782']

    if any("764" in s for s in somedata):
        print 'true'  # does print true

    if any(r"\x" in s for s in somedata):
        print 'true'  # does not print true, but \x is in somedata

RE: how to detect \x in string so it can be removed - wavic - Jun-12-2017

    >>> somedata = ['5', '#5\x029.764', '3.768', '3.757', '3.776', '3.787', '3.778',
    ... '3.788', '3.760', '3.777', '3.791', '3.792', '3.791', '3.796', '3.798',
    ... '3.787', '3.785', '3.802', '3.782']
    >>> if any(r'\x'):
    ...     print('True')
    ...
    True

RE: how to detect \x in string so it can be removed - buran - Jun-12-2017

@azimmermann I don't know what your data are or represent, but \x is the escape sequence that denotes a character by its hex value. You, however, are searching for the raw string r'\x', which is a literal backslash followed by the letter x. The two are not the same. The data contains the single character written as '\x02', not a backslash. You also cannot search for a non-raw '\x' on its own, because you will get ValueError: Invalid \x escape.

@wavic r'\x' will always evaluate as True, and so will any(r'\x'), because r'\x' is a non-empty string.
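To make buran's point concrete, here is a minimal sketch (written for Python 3, with a shortened copy of the data) of why the raw-string search fails and one way the control character could be detected instead. The name CONTROL_CHARS and the choice of "ASCII control characters plus DEL" as the thing to strip are assumptions for illustration, not something stated in the thread.

```python
import re

# Shortened copy of the data from the question.
somedata = ['5', '#5\x029.764', '3.768', '3.757']

# In a normal string literal, '\x02' is ONE character (STX, code point 2).
# The raw string r'\x02' is four characters: backslash, x, 0, 2.
assert len('\x02') == 1
assert len(r'\x02') == 4

# Searching for a literal backslash followed by x finds nothing,
# because the data never contained a backslash.
has_raw_backslash_x = any(r'\x' in s for s in somedata)

# Searching for the actual character works.
has_stx = any('\x02' in s for s in somedata)

# Assumption: "garbage" means ASCII control characters (0x00-0x1F) and DEL (0x7F).
CONTROL_CHARS = re.compile(r'[\x00-\x1f\x7f]')

has_control = any(CONTROL_CHARS.search(s) for s in somedata)
cleaned = [CONTROL_CHARS.sub('', s) for s in somedata]

print(has_raw_backslash_x, has_stx, has_control)  # False True True
print(cleaned)  # ['5', '#59.764', '3.768', '3.757']
```

Note that removing only the control character leaves the leading '#' in place; whether that is wanted depends on what the import expects.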
RE: how to detect \x in string so it can be removed - wavic - Jun-12-2017

You are right. I've never used any() before. I just saw the documentation.

RE: how to detect \x in string so it can be removed - nilamo - Jul-05-2017

(Jun-12-2017, 10:08 PM)wavic Wrote: I've never used any() before. I just saw the documentation.

any() checks whether any of the values is truthy, and returns as soon as it spots the first one (that's important, because if you pass it a generator, it won't cause the whole sequence to be iterated over). It's basically this:

    def any(items):
        for item in items:
            if item:
                return True
        return False

RE: how to detect \x in string so it can be removed - DeaD_EyE - Jul-05-2017

The sequence #5\x029.764 represents: #5[STX]9.764

    print '#5\x029.764'

You should read this: https://docs.python.org/2.0/ref/strings.html

'\x029' is not an octal escape. Python reads '\x02' as a hexadecimal escape (the character with code 0x02), followed by a plain '9'. Here you'll find all ASCII codes: http://www.asciitable.com/

If you use a raw string instead, Python won't interpret the escape sequences:

    print r'#5\x029.764'

I guess you want to filter out non-printable characters.

    import string

    new_list = [filter(lambda e: e in string.printable, st) for st in somedata]

    # or as generator expression
    filter_non_printable = (filter(lambda e: e in string.printable, st) for st in somedata)

If you want to clean your data of everything except digits and the decimal point:

    import string
    import pprint

    allowed_chars = string.digits + '.'
    new_list = [filter(lambda e: e in allowed_chars, st) for st in somedata]
    pprint.pprint(new_list)

By the way, use Python 3.x. The code is much cleaner and it's more fun.
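One caveat when following that last piece of advice: the filter() snippets above rely on Python 2, where filter() applied to a str returns a str. In Python 3 it returns a lazy filter object, so the lists would hold iterators rather than cleaned strings. A rough Python 3 equivalent, using a shortened copy of the data and a hypothetical helper named keep_only, could look like this:

```python
import string

# Shortened copy of the data from the question.
somedata = ['5', '#5\x029.764', '3.768', '3.757']


def keep_only(text, allowed):
    """Return text with every character that is not in `allowed` removed."""
    return ''.join(ch for ch in text if ch in allowed)


# Drop anything that is not in string.printable (this removes '\x02').
printable_only = [keep_only(s, string.printable) for s in somedata]

# Keep only digits and the decimal point (this also removes '#').
numeric_only = [keep_only(s, string.digits + '.') for s in somedata]

print(printable_only)  # ['5', '#59.764', '3.768', '3.757']
print(numeric_only)    # ['5', '59.764', '3.768', '3.757']
```

The second variant turns '#5\x029.764' into '59.764', which parses as a number but may not be the value the device meant to send, so it is worth checking such rows rather than trusting them blindly.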