Manipulating files Python 2.7
Python Forum (https://python-forum.io), General Coding Help, thread /thread-591.html

Manipulating files Python 2.7 - hugobaur - Oct-21-2016

Folks, I am having difficulty manipulating files.

Objective: I'm developing a script that creates folders and copies files from a source. The script asks the user how many months to create and replicates the files once per month. When a month falls in 2017, it needs to change a string in a .txt file in that directory.

Original string within the file: ANO INICIO DO ESTUDO 2016
String it must be changed to: ANO INICIO DO ESTUDO 2017

PS: The file to be changed is not the original but a copy.

Is it possible to change a line in a txt file?

What I need:
Read the source file into an array.
Identify a portion of a string within the array.
Modify the string if it is found.
Delete the source file and write another with the same name, or simply change a string (a word in a line) within the source file.

FILENAME_NEWAVE = Path of the source file
STRING_DGER = String to be searched
FILE_DATE = Year

This is not working; it is writing to the source file:

    def find_word_in_file_dger(FILENAME_NEWAVE, STRING_DGER, FILE_DATE):
        f = open(FILENAME_NEWAVE, "r+")
        file_array = f.readlines()
        for i in file_array:
            if i.find(STRING_DGER.encode('utf-8')):
                f.write(i)
            else:
                print ("TO LENDO O ARRAY")
                if FILE_DATE == "2016":
                    continue
                else:
                    i.replace(STRING_DGER, "ANO INICIO DO ESTUDO " + FILE_DATE)
                    f.write(i)
                    print("TO ESCREVENDO A LINHA CORRETAMENTE MLK!! ")
                    return i
        f.close()
        return False

Correct script

Edit: I have fixed the formatting. Next time, select all the code and press the "Remove Formatting" button, then the "Insert Python tag" button.

RE: Manipulating files Python 2.7 - Ofnuts - Oct-21-2016

1) It's easier to understand what you want to do if you use English names for your variables.
2) It's less painful on the eyes if you use lowercase.
3) I don't expect a function called find_word_in_file() to write the file.
4) On the whole, you should never read and write the same file.
There is only one read-write pointer, so once you have done the readlines(), that pointer is at the end of the file, and the lines you write are appended at the end of the file. You could use seek statements to force the pointer where you want it, but this would work only in this very specific case where you replace a string with a string of the same length.

The usual way is to open your source in read mode and open a second file in write mode (use a temporary name created with tempfile.mkstemp), copy the data over (with possible modifications), close that temporary file, then erase the source file and rename the temporary file to the source name(*).

(*) Even safer: close the temporary file, rename the source to some backup name, rename the temporary file to the source name, then erase the backup.

RE: Manipulating files Python 2.7 - hugobaur - Oct-25-2016

Thanks for the tips. I had the same idea; my difficulty is with Python itself. So I wrote a program and it works. Thanks for answering me!

    import os
    import re

    def dger_ano(FILE_DATE, ORIGEM, DESTINO):
        cache = None
        string = "ANO INICIO DO ESTUDO 2016" #str(int(FILE_DATE) - 1)
        with open(ORIGEM , "r") as f:
            cache = f.read()
        new_file = re.sub(string, "ANO INICIO DO ESTUDO {}".format(FILE_DATE), cache)
        if new_file:
            with open(DESTINO + "/temporario.txt", "w") as f:
                f.write(new_file)
            os.remove(DESTINO + "/DGER.dat")
            os.rename(DESTINO + '/temporario.txt',DESTINO + '/DGER.dat')

In Python 2: when the file contains accented characters, this error message appears:
"UnicodeDecodeError: 'ascii' codec can't decode byte 0xba in position 1355: ordinal not in range(128)"
In Python 3: it works! BUT I need it to work on Python 2.

RE: Manipulating files Python 2.7 - Ofnuts - Oct-25-2016

Fairly well explained here. In Python 2, file.read() reads bytes as single-byte characters that may have no real meaning. If you use characters that aren't in the ASCII set (ASCII codes up to 127, which excludes accented characters), you have to use the 'unicode' type, which behaves like a string but can contain non-ASCII characters.
To go from the string of single bytes to unicode, you decode it:

    # read in the file contents
    iso=open('iso-8859-15.txt').read()
    utf=open('utf-8.txt').read()

    # this is how they look, one <str> character for each byte in the source file
    print 'ISO:', repr(iso)
    print 'UTF:', repr(utf)

    # transform them to unicode, specifying the appropriate encoding
    unicodeISO=unicode(iso,encoding='iso-8859-15')
    unicodeUTF=unicode(utf,encoding='UTF-8')

    # Now, as unicode strings, they are identical
    print repr(unicodeISO),unicodeISO
    print repr(unicodeUTF),unicodeUTF

(the two data files attached)

You may wonder why the Unicode string looks like the ISO one. It's an optical illusion. Of course the people who defined Unicode didn't completely reinvent the wheel, and integrated as many existing encodings as feasible. So the numbers that encode characters in ISO-8859 and Unicode can be the same. However, the first is a one-byte 0xe9 and the second is really the Unicode code point U+00E9.

Needless to say, this means that you have to know in advance which encoding was used for the files... On the other hand, there aren't that many encodings for Romance languages, so it will likely be either UTF-8 or some variant of ISO-8859.

RE: Manipulating files Python 2.7 - hugobaur - Oct-31-2016

Hi, I can't make this work.
I used your code to read my file, and that works:

    # read in the file contents
    iso = open('E:/ENEL/Modelos/NW201610/DGER.dat').read()
    utf = open('E:/ENEL/Modelos/NW201610/DGER.dat').read()

    # this is how they look, one <str> character for each byte in the source file
    print 'ISO:', repr(iso)
    print 'UTF:', repr(utf)

    # transform them to unicode, specifying the appropriate encoding
    unicodeISO = unicode(iso, encoding='iso-8859-15')
    #unicodeUTF = unicode(utf, encoding='UTF-8')

    # Now, as unicode strings, they are identical
    print repr(unicodeISO), unicodeISO
    #print repr(unicodeUTF), unicodeUTF

But my original function does not work:

    def dger_ano(FILE_ORIGEM, FILE_DATE, ORIGEM, DESTINO): #ORIGEM: ../file.txt | DESTINO: ../../
        cache = None
        string = "YEAR " + FILE_ORIGEM
        with open(ORIGEM , "r") as f:
            cache = f.read()
        unicodeISO = unicode(cache, encoding='iso-8859-15')
        print ('ISO:', repr(unicodeISO))
        new_file = re.sub(string, "ANO INICIO DO ESTUDO {}".format(FILE_DATE), cache)
        if new_file:
            with open(DESTINO + "/temporario.txt", "w") as unicodeISO:
                unicodeISO.write(new_file)
            os.remove(DESTINO + "/DGER.dat")
            os.rename(DESTINO + '/temporario.txt',DESTINO + '/DGER.dat')

Edit admin: Next time, select all the code and press the "Remove Formatting" button.

RE: Manipulating files Python 2.7 - snippsat - Oct-31-2016

The rule for Unicode is: the same encoding all the way in and out. So for Python 2.x use codecs or the newer io module; Python 3.x has this built in. Put a utf-8 coding declaration on the first line of the script, because Python 2.x uses ASCII as the default source encoding. Then it looks like this.

Test input iso.txt:
Déjà vu peut-être...

    # -*- coding: utf-8 -*-
    import codecs

    with codecs.open("iso.txt", encoding='utf-8') as f:
        uni = f.read()

    with codecs.open("iso_out.txt", 'w', encoding='utf-8') as f_out:
        f_out.write(uni)

iso_out.txt:
Déjà vu peut-être...

RE: Manipulating files Python 2.7 - hugobaur - Nov-01-2016

Snippsat, thanks for responding. I tried this:

    import codecs

    with codecs.open("E:/ENEL/Modelos/NW201610/DGER.dat", encoding='utf-8') as f:
        uni = f.read()

    with codecs.open("iso_out.txt", 'w', encoding='utf-8') as f_out:
        f_out.write(uni)

I have attached my file. Then I just changed encoding='utf-8' to encoding='iso-8859-15' and it works!!!! I don't know why, but it works.

Thanks snippsat, Ofnuts!
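
For later readers, here is a minimal sketch that puts the two pieces of advice from this thread together: decode and encode with the same explicit encoding (snippsat), and write to a temporary file that then replaces the original (Ofnuts). It is not code posted by anyone in the thread. The function name `replace_in_file` and its parameters are hypothetical, and the default `iso-8859-15` is only an assumption based on what worked for DGER.dat in the last post. That result is consistent with the reported byte 0xba, which is "º" in ISO-8859-15 but cannot stand on its own in UTF-8. `io.open` is used because it behaves the same on Python 2.7 and Python 3.

```python
import io
import os
import tempfile


def replace_in_file(path, old, new, encoding="iso-8859-15"):
    """Replace every occurrence of `old` with `new` inside the file at `path`.

    Hypothetical helper for illustration. `old` and `new` should be unicode
    text (u"..." literals on Python 2). Returns True if the file was rewritten.
    """
    # Decode on the way in; newline="" keeps the original line endings as-is.
    with io.open(path, "r", encoding=encoding, newline="") as src:
        text = src.read()

    if old not in text:
        return False  # nothing to change, leave the file untouched

    # Write the modified text to a temporary file in the same directory,
    # so the final rename does not cross filesystems.
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder)
    with io.open(fd, "w", encoding=encoding, newline="") as dst:
        dst.write(text.replace(old, new))

    # Remove the original first: os.rename does not overwrite an existing
    # file on Windows.
    os.remove(path)
    os.rename(tmp_path, path)
    return True
```

With the file from the thread, a call could look like `replace_in_file(u"E:/ENEL/Modelos/NW201610/DGER.dat", u"ANO INICIO DO ESTUDO 2016", u"ANO INICIO DO ESTUDO 2017")`. Plain `str.replace` is used here instead of `re.sub` because the search text is a fixed string, not a pattern.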