Python Forum

Replace words in a file
I have problems with words in a given text that contain a backslash, because sequences such as \b and \t have a special meaning in Python. Characters inside longer words are also replaced when they should not be. I get the output:

reak
as aes
a aes uplet { c ( aes a )}
ar

"\break" gives "reak", "\tuplet" gives "uplet" with an unwanted TAB in front because of \t, and "\bar" becomes "ar".
"des" should become "bes", but becomes "aes" because of the entry ' d' : ' a' in the dictionary, which is only meant to replace single characters standing in isolation.

Is there a solution? Modification of the original given text is not a solution.

Thanks

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Tue Jan 21 14:01:45 2020
@author: bb
"""
import re
 
def replace_words(text, word_dic):
    """
    take a text and replace words that match a key in a dictionary with
    the associated value, return the changed text
    """
    rc = re.compile('|'.join(map(re.escape, word_dic)))
    def translate(match):
        return word_dic[match.group(0)]
    return rc.sub(translate, text)
 
 
str1 = \
""" 
 \break
 es des
  d des \tuplet { c ( des d )}
  \bar 
"""
    
word_dict = {
' \tuplet' : ' \tuplet' ,
'\return' : '\return' ,
'\break' : '\break' ,
'\bar' : '\bar ' ,
' es' : ' as' ,
' d' : ' a' ,
'des' : 'bes'
}

str2 = replace_words(str1, word_dict)
 
# test
print (str2)
How do you escape a backslash? With another backslash :) Or use a raw string:

# the r prefix makes this a raw string: backslashes are kept literally
str1 = r""" 
 \break
 es des
  d des \tuplet { c ( des d )}
  \bar 
"""
>>> str1                                                             
' \n \\break\n es des\n  d des \\tuplet { c ( des d )}\n  \\bar \n'
>>> str1.replace('\\break', '\\car')                                 
' \n \\car\n es des\n  d des \\tuplet { c ( des d )}\n  \\bar \n'
>>> print(str1.replace('\\break', '\\car'))                          

 \car
 es des
  d des \tuplet { c ( des d )}
  \bar 
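Building on the raw string, here is a minimal sketch of how the replacement function might be adapted so that only isolated words are replaced. It rests on assumptions that are not in the original post: the identity entries for \tuplet, \break and \bar are dropped, the keys are written without leading spaces, and a "word" is taken to mean any run of characters delimited by whitespace.

```python
import re

def replace_words(text, word_dic):
    """Replace whole whitespace-delimited tokens that match a key in word_dic."""
    alternatives = '|'.join(map(re.escape, word_dic))
    # (?<!\S) and (?!\S) demand whitespace (or the string edge) on both sides,
    # so 'd' cannot match the first letter of 'des'.
    pattern = re.compile(r'(?<!\S)(?:' + alternatives + r')(?!\S)')
    return pattern.sub(lambda match: word_dic[match.group(0)], text)

# Raw string: \b and \t stay as a backslash followed by a letter.
str1 = r"""
 \break
 es des
  d des \tuplet { c ( des d )}
  \bar
"""

# Assumed mapping: only the entries that actually change something.
word_dict = {'es': 'as', 'd': 'a', 'des': 'bes'}

str2 = replace_words(str1, word_dict)
print(str2)
```

With this input, the second line becomes " as bes" and the third becomes "  a bes \tuplet { c ( bes a )}", while \break, \tuplet and \bar are left untouched.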
Not exactly an answer to your question, but in the first three key:value pairs the key and the value are identical, so those entries replace nothing.
Also, because \b, \t and \r are escape sequences, Python converts them to control characters (backspace, TAB, carriage return) when it reads the string literal, and printing the string emits those characters instead of the text you typed.
So you need print(repr(str2)) to see what your function actually returns.
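A small sketch of the difference, using illustrative variable names (plain and raw) that are not in the original code:

```python
plain = '\break'   # '\b' is read as a single backspace character
raw = r'\break'    # raw string: the backslash and the 'b' stay separate

print(repr(plain))  # '\x08reak'
print(repr(raw))    # '\\break'
print(len(plain))   # 5
print(len(raw))     # 6
```

The repr() output shows the backspace as \x08, which plain print() would send to the terminal as an invisible control character.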
Thanks. I know, and I described the backslash problem in my first post. But reading the responses gave me an idea: I could double every "\" to "\\" in a simple text editor and feed the modified text into the program. I will report soon whether it works. (I have some other problems to work on right now.)
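One point that may make the doubling unnecessary: escape sequences are only interpreted in string literals inside Python source code. If the text is read from a file at run time, as the thread title suggests, the backslashes arrive unchanged. A minimal sketch, assuming a hypothetical input file named score.txt that is created here only to keep the example self-contained:

```python
import os
import tempfile

# Hypothetical input file; in real use the file would already exist.
path = os.path.join(tempfile.mkdtemp(), 'score.txt')
with open(path, 'w') as f:
    # Backslashes are doubled here only because this is a literal in source code.
    f.write('\\break\n\\tuplet { c ( des d )}\n\\bar\n')

# Reading the file back: no escape processing takes place.
with open(path) as f:
    text = f.read()

print(text)        # shows \break, \tuplet and \bar exactly as stored
print(repr(text))  # repr doubles the backslashes for display only
```

The text read this way contains single backslashes and no TAB or backspace characters, so it can be passed to a replacement function directly.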