csv troubles - Python Forum (https://python-forum.io), General Coding Help (https://python-forum.io/forum-8.html), thread /thread-28997.html
csv troubles - DPaul - Aug-13-2020

Hi,
I downloaded a CSV database with names, dates, etc. I can read it partially, line by line, split on ',' and get the values I want, except that some of the lines, especially the first name of the person, contain strange characters. I cannot even read past the first occurrence of one. How can I read past these characters and not lose the rest of the line?

1924,"DUPONT, Clément",FRA,Men,Rugby,Silver

    f = open(file,'r')
    for line in f:
        try:
            l = line[:-1]
            l = l.split(',')
        except:
            pass
    f.close()
RE: csv troubles - Gribouillis - Aug-13-2020

You are trying to decode the file with the cp1252 codec. It is probably encoded in utf8 or iso8859-1. Use the encoding parameter of the open() function (or io.open, or codecs.open). You can also use the chardet command to guess the file's encoding.

RE: csv troubles - DPaul - Aug-13-2020

So I installed chardet (who invents these names?), and I got the result:

It is not a surprise that it's an encoding problem, but I thought utf-8 was Python's standard for reading. When I declare my file as such, I discover some wonderful first names like Désiré and Frédéric...
Thanks for your help,
Paul

RE: csv troubles - Gribouillis - Aug-13-2020

You could perhaps use the utf8 mode, as described here

RE: csv troubles - DPaul - Aug-13-2020

(Aug-13-2020, 03:34 PM)Gribouillis Wrote: You could perhaps use the utf8 mode, as described here

You learn a new thing every day. Some days, more than one.
Thx,
Paul
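
For readers who land on this thread later, here is a minimal sketch of the advice above: pass an explicit encoding to open(). It assumes the file really is UTF-8, which the thread suggests but does not show, since the chardet output is missing. It also swaps the manual split(',') for the standard csv module, which is not something the posters proposed; it is used here because the sample line has a comma inside a quoted field. The names read_rows and demo_path are hypothetical, and the temporary file only stands in for the downloaded database.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Read a CSV file with an explicit encoding and return its rows as lists.

    The utf-8 default is an assumption; use whatever chardet reports.
    """
    # newline="" is what the csv module documentation recommends
    # for files handed to csv.reader.
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


# Demo: write the sample line from the question to a temporary UTF-8 file.
sample = '1924,"DUPONT, Clément",FRA,Men,Rugby,Silver\n'
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".csv", delete=False
) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

rows = read_rows(demo_path)
os.remove(demo_path)

# The quoted field keeps its comma and the accented character survives.
print(rows[0])
```

With a plain split(','), "DUPONT, Clément" would be cut into two fields; csv.reader keeps it as one. The utf8 mode mentioned in the last replies can be switched on without touching the code, either with `python -X utf8 script.py` or by setting the environment variable `PYTHONUTF8=1` (Python 3.7 and later).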