Posts: 4,647
Threads: 1,494
Joined: Sep 2016
in this:
>>> repr('\n')
"'\\n'"
>>> repr('\b')
"'\\x08'"
>>>
you can use either backslash escape in source code, but repr() does not produce both, just the newline one. is there a way i can convert the codes back to their backslash forms for every backslash escape that is supported in Python source code?
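For reference, a quick way to see which control characters repr() keeps in short form and which it writes as \xNN is to loop over them. This is only a sketch for CPython 3; the helper name show_reprs is made up for illustration.

```python
def show_reprs():
    """Map each short-escape control character to the escape repr() emits for it."""
    chars = '\0\a\b\t\n\v\f\r'
    # repr() wraps the text in quotes, so slice them off
    return {ch: repr(ch)[1:-1] for ch in chars}

for ch, shown in show_reprs().items():
    print(f'{ord(ch):#04x} -> {shown}')
```

On CPython only tab, newline and carriage return come back in short form; the others are shown as \xNN.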
Tradition is peer pressure from dead people
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
Posts: 4,790
Threads: 76
Joined: Jan 2018
(Jun-05-2022, 12:02 AM)Skaperen Wrote: is there a way i can reverse codes back their backslash forms for every backslash escape that is supported in Python source code?
You can replace \x08 in the repr by \b and similarly for the other escaped characters.
Posts: 4,647
Threads: 1,494
Joined: Sep 2016
(Jun-05-2022, 06:05 AM)Gribouillis Wrote: You can replace \x08 in the repr by \b and similarly for the other escaped characters.
i was thinking of making my own conversion but this is a better idea.
Posts: 4,647
Threads: 1,494
Joined: Sep 2016
(Jun-05-2022, 06:05 AM)Gribouillis Wrote: You can replace \x08 in the repr by \b and similarly for the other escaped characters.
i will need to use raw strings or escape the \ to make the .replace() calls.
Posts: 4,790
Threads: 76
Joined: Jan 2018
Jun-06-2022, 05:29 AM
(This post was last modified: Jun-06-2022, 05:53 AM by Gribouillis.)
I would use something like (untested)
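import re

# Keys are the escapes as repr() writes them, without the surrounding quotes.
mapping = {
    repr('\b')[1:-1]: r'\b',
    repr('\n')[1:-1]: r'\n',
    # ...
}

def func(match):
    s = match.group(0)
    # Escaped backslashes and unmapped \xNN escapes come back unchanged.
    return mapping.get(s, s)

# text is the repr() output to convert. Matching \\ as a unit keeps an
# escaped backslash from being misread as the start of a \xNN escape.
result = re.sub(r'\\(?:\\|x[a-f0-9]{2})', func, text)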
Posts: 4,647
Threads: 1,494
Joined: Sep 2016
here is what i initially created:
def unctrl(s):
    """Control characters (in str or bytes or bytearray) become backslash equivalents."""
    if isinstance(s, str):
        return repr(s)[1:-1].replace('\\x00','\\0').replace('\\x07','\\a').replace('\\x08','\\b').replace('\\x09','\\t').replace('\\x0a','\\n').replace('\\x0b','\\v').replace('\\x0c','\\f').replace('\\x0d','\\r')
    elif isinstance(s, (bytes, bytearray)):
        # repr() of bytes is a str like "b'...'", so strip the b prefix and replace with str patterns
        return repr(bytes(s))[2:-1].replace('\\x00','\\0').replace('\\x07','\\a').replace('\\x08','\\b').replace('\\x09','\\t').replace('\\x0a','\\n').replace('\\x0b','\\v').replace('\\x0c','\\f').replace('\\x0d','\\r')
    else:
        raise TypeError(f'unctrl(): argument needs to be str or bytes or bytearray, not {s!r}')
if i need to just start over, tell me. note that i am supporting bytes because i need to handle files that can contain bytes that are invalid as UTF-8.
Posts: 4,790
Threads: 76
Joined: Jan 2018
Jun-08-2022, 06:29 AM
(This post was last modified: Jun-08-2022, 06:29 AM by Gribouillis.)
(Jun-08-2022, 05:56 AM)Skaperen Wrote: if i need to just start over, tell me
What if the repr is "\\\\x07" ? or '\\\\\\x07'
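To make the concern concrete, here is a small sketch of what goes wrong. naive_unctrl is a hypothetical cut-down version of the chained-replace idea (str only, one escape), and ast.literal_eval is used just to read the converted text back as a string literal.

```python
import ast

def naive_unctrl(s):
    """Hypothetical cut-down version of the chained-replace idea (str, one escape)."""
    return repr(s)[1:-1].replace('\\x07', '\\a')

original = '\\x07'                  # four characters: backslash, x, 0, 7
converted = naive_unctrl(original)  # repr doubles the backslash, then replace hits the tail
round_trip = ast.literal_eval("'" + converted + "'")

print(converted)
print(round_trip == original)
```

The replace cannot tell that the backslash in front of x07 is the second half of an escaped backslash, so the converted text no longer reads back as the original string and the comparison prints False.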
Posts: 4,647
Threads: 1,494
Joined: Sep 2016
(Jun-08-2022, 06:29 AM)Gribouillis Wrote: What if the repr is "\\\\x07" ? or '\\\\\\x07'
good point to think about. thanks!!
Posts: 4,647
Threads: 1,494
Joined: Sep 2016
(Jun-23-2022, 08:08 PM)ervinjason Wrote: I believe you can replace \x08 in the repr by \b and similarly for the other escaped characters.
but it might originally be a couple of \ followed by the plain text 'x08'. you'd see "\\\\x08". you don't want to replace the last 4 characters with '\b', pretending that was the original. even if you do "\\\\" -> '\\' first, it would still really be "\\\\x08" -> '\\x08', and then if "\x08" -> '\b' is applied, you now have '\\b'. how would that get back to the real original? stateful context means something. it depends on how many \ precede it, since \ can escape \ instead of only allowing '\x5c'. maybe you might have '\\x5c' some day.
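One way to keep track of that context without writing a full state machine is a single left-to-right regex pass that consumes each escaped backslash as a unit, along the lines of the re.sub suggestion earlier in the thread. The sketch below is untested beyond a few cases; the names unctrl_safe, SHORT_FORMS and ESCAPE_RE are made up here, the table is deliberately partial, and the result is always a str, even for bytes input.

```python
import re

# Assumed, partial table: escapes repr() spells as \xNN that also have a short form.
# \x00 is left out because a short \0 followed by a digit would read as an octal escape.
SHORT_FORMS = {r'\x07': r'\a', r'\x08': r'\b', r'\x0b': r'\v', r'\x0c': r'\f'}

# Either an escaped backslash (consumed as a unit) or a \xNN escape.
ESCAPE_RE = re.compile(r'\\(?:\\|x[0-9a-f]{2})')

def unctrl_safe(s):
    """Return repr-style text for s, using short escapes where the table has one."""
    if isinstance(s, str):
        text = repr(s)[1:-1]
    elif isinstance(s, (bytes, bytearray)):
        text = repr(bytes(s))[2:-1]   # drop the leading b' and the trailing '
    else:
        raise TypeError(f'unctrl_safe(): expected str, bytes or bytearray, not {type(s).__name__}')
    # Matches are non-overlapping, so the x08 after an escaped backslash is never touched.
    return ESCAPE_RE.sub(lambda m: SHORT_FORMS.get(m.group(0), m.group(0)), text)

print(unctrl_safe('\b'))      # a real backspace character
print(unctrl_safe('\\x08'))   # a literal backslash followed by the text x08
```

Because the pattern tries the doubled backslash first at every position, the number of preceding backslashes is handled by the scan itself: a real backspace is shortened, while a literal backslash followed by x08 is left as repr() wrote it.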