Python Forum
Additional slashes being added to string
#1
I have the following code, where I have the variable domain_with_escapes the way I want it. When I add it as part of a value in a dictionary, there are additional slashes that are getting added, and I can't figure out why.

#!/usr/local/bin/python3.4

import re

domain_with_escapes='aw\.me\.org'
print("Domain with escapes is: " + str(domain_with_escapes) + '\r\r')
d0 = "hw"
s0 = "site"
d1 = "cmc"
s1 = "sitename"

key = d0 + " in " + s0
match_string = '.*' + str(d1) + '.*' + str(s1) + '.*' + domain_with_escapes
print("match_string is: " + str(match_string) + '\r\r' )

params_for = dict()
params_for[key] = {'name~' :  str(match_string) }
print("Dict params_for is: " + str(params_for) + '\r\r')

quit()
When I run it, I get this:

$ ./test.py
Domain with escapes is: aw\.me\.org
match_string is: .*cmc.*sitename.*aw\.me\.org
Dict params_for is: {'hw in site': {'name~': '.*cmc.*sitename.*aw\\.me\\.org'}}
Any thoughts on why the additional slashes are getting added?
#2
Use an r prefix before the string (which makes it a raw string), like:

domain_with_escapes= r'aw\.me\.org'
print(domain_with_escapes)
result:
Output:
aw\.me\.org
If you're using an interactive Python session, it may appear that there are two slashes,
but that is only the escaped form used for display. When you print the string or write it to a file, there is just one.
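
To make the distinction concrete, here is a minimal sketch (the name params is only for illustration). Converting a dictionary to a string with str() shows each key and value through repr(), and repr() writes a backslash as \\ so the text could be pasted back in as a Python literal. The string itself still holds a single backslash before each period.

```python
domain_with_escapes = r'aw\.me\.org'
params = {'name~': domain_with_escapes}  # hypothetical dict, for illustration

print(domain_with_escapes)        # aw\.me\.org
print(repr(domain_with_escapes))  # 'aw\\.me\\.org'
print(str(params))                # {'name~': 'aw\\.me\\.org'}
print(params['name~'])            # aw\.me\.org

# Counting characters shows what is really stored: 11 characters, 2 backslashes.
print(len(domain_with_escapes))         # 11
print(domain_with_escapes.count('\\'))  # 2
```

So the doubled slashes in the printed dictionary come from how the dictionary is displayed, not from what the string contains.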
#3
OK, thanks. So I take it that in certain circumstances, i.e. if it's printed or written to a file, there may appear two slashes, but there is really only one in memory?

Most of all, I'm trying to figure out how this would differ from the equivalent string in the following dictionary:

params_for = {
    "hw1 in location1" : {'name~' : '.*cmc.*site1.*\.aw\.me\.org'},
    "hw2 in location2" : {'name~' : '.*drc.*site1.*\.aw\.me\.org'},
}
I ask because when I pass this data structure for the Infoblox API to search, I get the expected results, i.e. I see data I expect to see. Moreover, I didn't have to set this as a raw string, and it still returned the desired results. However, when I use the data structure as earlier (the only difference being the double slashes), I see no records returned. So there must be something about how '.*cmc.*site1.*\.aw\.me\.org' differs from the contents of the match_string variable in my code earlier.
#4
I'm not sure what the full rule is, but it does show the escaped form sometimes.
I never really investigated it or thought about it in depth, because after
so many years of programming I just seem to get it right by habit.
Reply
#5
So, anybody know what the full rule is?

It seems that whenever the string is assigned as an element in a dictionary, python adds an extra slash to '\.' to make '\\.'. I know this can't be just a representation of the hash, as the Infoblox API is seeing it differently, meaning that python is passing it different input as a request.

I changed the code to the following, and still get the same thing, although I don't see the same behavior when assigning to a variable, as opposed to an element in a dictionary. Anything special about assigning to a dictionary that would trigger the additional slash to be added?

domain_with_escapes=r'aw\.me.org'
print("Domain with escapes is: " + str(domain_with_escapes) + '\r\r')
d0 = "hw"
s0 = "site"
d1 = "cmc"
s1 = "sitename"

key = d0 + " in " + s0
match_string = '.*' + str(d1) + '.*' + str(s1) + '.*' + str(domain_with_escapes)
tmp_dict = dict()
tmp_dict['name~'] = str(match_string)
var1 = match_string

print("Dict tmp_dict is: " + str(tmp_dict) + '\r\r')

print("match_string is: " + str(match_string) + '\r\r' )
print("match_string without str is: " + match_string + '\r\r' )
print("var1 is: " + match_string + '\r\r' )

params_for = dict()
#params_for[key] = {'name~' :  str(match_string) }
params_for[key] = tmp_dict
print("Dict params_for is: " + str(params_for) + '\r\r')
I run it and get the same results.

Domain with escapes is: aw\.me.org
Dict tmp_dict is: {'name~': '.*cmc.*sitename.*aw\\.me.org'}
match_string is: .*cmc.*sitename.*aw\.me.org
match_string without str is: .*cmc.*sitename.*aw\.me.org
var1 is: .*cmc.*sitename.*aw\.me.org
Dict params_for is: {'hw in site': {'name~': '.*cmc.*sitename.*aw\\.me.org'}}
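
One way to check whether the dictionary really changes the string is to compare the stored value with the original and count the backslashes, instead of reading the printed dictionary. A minimal sketch, reusing the values from the code above; the name stored and the hostname are made up for illustration:

```python
import re

domain_with_escapes = r'aw\.me.org'
match_string = '.*' + 'cmc' + '.*' + 'sitename' + '.*' + domain_with_escapes

tmp_dict = {'name~': match_string}
stored = tmp_dict['name~']  # 'stored' is a hypothetical name for this check

print(stored == match_string)  # True: the dictionary holds the same text
print(stored.count('\\'))      # 1: a single backslash, not two
print(str(tmp_dict))           # {'name~': '.*cmc.*sitename.*aw\\.me.org'}

# The value taken out of the dictionary still works as the intended regex.
hostname = 'cmc01.sitename.aw.me.org'  # made-up hostname
print(bool(re.search(stored, hostname)))  # True
```

If this prints True and 1, the string in the dictionary is unchanged, and the \\ only shows up when the whole dictionary is converted to text for printing.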
#6
I do know that '\' is special: it introduces the escapes for line feed, carriage return,
form feed, etc., and was already in use when I started programming back in the 1960s.
It therefore needs to be escaped itself.

This might be helpful: https://stackoverflow.com/questions/4020...-in-python
#7
Yes, I know the slash is a special character, and needs to be escaped. However, I'm not wanting to escape it; rather, I'm using it to escape the period in the domain name that it precedes. What's happening here (so far as I can tell) is that when I put in that slash to escape the period, the python interpreter is thinking I need to escape my slash, which makes the regular expression wrong. As the regexp is wrong, I'm not getting the search results I need.

That said, I'm putting in a workaround for now by putting in wildcards for the period, i.e. instead of \.aw\.me\.org, I've changed my code to use .*aw.*me.*org. Of course, that's more resource intensive than it has to be, but it seems the only way I can get this to work now.
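
For comparison, here is a small sketch of what the wildcard workaround gives up, using Python's own re module. Both hostnames are made up, and the Infoblox side is not modelled at all. Besides the extra work, the looser pattern also accepts names that the escaped version rejects.

```python
import re

strict = r'.*cmc.*sitename.*aw\.me\.org'  # periods escaped, match literally
loose = '.*cmc.*sitename.*aw.*me.*org'    # wildcard workaround

wanted = 'cmc01.sitename.aw.me.org'       # made-up hostname
unwanted = 'cmc01.sitename.awesome.org'   # made-up hostname

for name in (wanted, unwanted):
    print(name, bool(re.search(strict, name)), bool(re.search(loose, name)))
# cmc01.sitename.aw.me.org True True
# cmc01.sitename.awesome.org False True
```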
#8
Why not use a raw string, as suggested by Larz in post #2?
#9
The backslash is the escape character. If you want to use a backslash, it therefore needs to be escaped... by a backslash. Which is why almost all non-trivial regexes are raw strings.

So there are three options:
1) escape the backslashes
2) use a raw string
3) don't use a backslash at all, and instead let the regex module do the escaping.  If you don't want to use a raw string, this might be the best option for you.

>>> import re
>>> "spam\\.eggs"
'spam\\.eggs'
>>> r"spam\.eggs"
'spam\\.eggs'
>>> re.escape("spam.eggs")
'spam\\.eggs'
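
As a sketch of how the three options fit the pattern from the first post: all three spellings produce the same string, and re.escape() can build the domain part from the plain domain name. The hostnames below are made up for illustration.

```python
import re

escaped = "spam\\.eggs"         # option 1: escape the backslash
raw = r"spam\.eggs"             # option 2: raw string
auto = re.escape("spam.eggs")   # option 3: let re do the escaping

print(escaped == raw == auto)   # True
print(len(raw))                 # 10: one backslash, not two

# Building the search pattern from the unescaped domain name.
domain = re.escape("aw.me.org")
pattern = ".*cmc.*sitename.*" + domain
print(pattern)                  # .*cmc.*sitename.*aw\.me\.org

print(re.search(pattern, "cmc01.sitename.aw.me.org") is not None)  # True
print(re.search(pattern, "cmc01.sitename.awXmeXorg") is not None)  # False
```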