Python Forum
Letters with Accents Not Displaying



Letters with Accents Not Displaying - Phylus - Jun-11-2019

My Python 2.7.10 code (on macOS 10.14.5) replaces accented letters such as á with escape sequences such as \xc3\xa1. The accents display correctly in both my text file of Portuguese words (in TextEdit) and my Mac Terminal, both of which use Unicode (UTF-8). I've searched online and tried various fixes, but nothing works.

This is my code:

f = open(filename,"r")
filelist = list(f)
n = filename + "_As_List"
l = open(n,"w")
print(filelist)
l.write(str(filelist))

How can I get letters with accents to display in the outputs of that code?


RE: Letters with Accents Not Displaying - nilamo - Jun-11-2019

You could try opening the files as UTF-8.
Honestly, I hope that works, because Python 3 handles this automatically, and I've forgotten how to do it in Python 2 :/

open(filename, "r", encoding="utf8")


RE: Letters with Accents Not Displaying - Phylus - Jun-11-2019

Thank you, I think the solution may be along these lines. Unfortunately, this syntax gives the error message:

Error:
TypeError: 'encoding' is an invalid keyword argument for this function



RE: Letters with Accents Not Displaying - nilamo - Jun-11-2019

I was close. The encoding parameter of Python 3's built-in open doesn't exist in Python 2; instead, the codecs module (and the io module) provides an open function that does support it: https://docs.python.org/2/howto/unicode.html#reading-and-writing-unicode-data

import codecs
f = codecs.open('somefile', mode='r', encoding='utf-8')
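
For comparison, here is a minimal sketch of the io variant mentioned above. io.open is available from Python 2.6 and is the same function as the built-in open in Python 3, so this should run on either. The helper name read_lines_utf8 and the sample file palavras.txt are made up for illustration; they are not from the original post.

```python
import io
import os
import tempfile


def read_lines_utf8(path):
    """Return the file's lines as decoded text (unicode in Python 2, str in Python 3)."""
    with io.open(path, mode="r", encoding="utf-8") as f:
        return list(f)


# Hypothetical sample file, created only so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "palavras.txt")
with io.open(sample_path, mode="w", encoding="utf-8") as f:
    f.write(u"o p\u00e1ssaro\n")  # u"\u00e1" is the letter á

# Each entry is decoded text, not raw UTF-8 bytes.
lines = read_lines_utf8(sample_path)
```

The with blocks close the files automatically, which the snippet in the question does not do.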



RE: Letters with Accents Not Displaying - Phylus - Jun-11-2019

Thank you again. With that code, what happens is:

Actual result:

[u'o p\xe1ssaro\n']

instead of:

Desired result:

['o pássaro\n']

I can remove the leading u, but the problem is the \xe1 instead of á.

(By the way 'o pássaro' is Portuguese for 'the bird'.)


RE: Letters with Accents Not Displaying - Phylus - Jun-12-2019

For reading files, the solution is indeed:

import codecs
f = codecs.open('somefile', mode='r', encoding='utf-8')
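
In the example from my question, the lists look wrong because printing a list (or calling str() on it) shows the repr of each entry, and in Python 2 that repr escapes non-ASCII characters. If you work with the lists and output the individual entries, the accents appear. Thanks to nilamo.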
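
A rough sketch of what "output entries" can look like for the file-writing half of the question: instead of writing str(filelist), write each entry through a file opened with an explicit encoding. The helper name save_entries, the stand-in list, and the output path are assumptions for illustration only, and io.open is used here so the sketch runs on both Python 2 and Python 3.

```python
import io
import os
import tempfile


def save_entries(entries, path):
    """Write each text entry to path as UTF-8, in order."""
    with io.open(path, mode="w", encoding="utf-8") as out:
        for entry in entries:
            out.write(entry)  # writes the text itself, not its repr


# Stand-in for the list read from the Portuguese word file (hypothetical data).
filelist = [u"o p\u00e1ssaro\n", u"a ma\u00e7\u00e3\n"]

out_path = os.path.join(tempfile.mkdtemp(), "words_As_List")
save_entries(filelist, out_path)

# Read the result back to check that the accented letters survived.
with io.open(out_path, mode="r", encoding="utf-8") as f:
    saved = f.read()
```

The output file then holds the words themselves rather than a Python list literal, so there are no brackets, quotes, or escape sequences in it.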