Python Forum
'utf-8' codec can't decode byte 0xda in position 184: invalid continuation byte
#1
Hi everyone,

I'm getting this error. I have searched online, but I don't understand how the explanations apply to my specific case, and I don't know why it is happening.

This is the error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xda in position 184: invalid continuation byte

I'm trying to read a text file with the following lines:

fileList = []
# fileName is the path to the SRT file; "with" closes the file when done
with open(fileName, 'r', encoding="utf8") as inFile:
    for line in inFile:
        fileList.append(line)
What I'm reading is a simple SRT file. I wrote a program that takes an SRT file and fixes the timestamps to eliminate overlaps, which the editor sometimes introduces. That function works correctly and has no problem reading the original file. To create the new, corrected file, I just copy the old file and replace the timestamp lines with the corrected ones. However, when I try to read the newly generated file to convert it to another format, I get this error.

I've been working with functions that convert and manipulate this kind of file for a while, and I had never gotten this error before, only a similar one that I can't remember now, which is why I added encoding="utf8".

I don't really know what "position 184" means, none of the lines is even longer than 33 characters, and line 184 of the file is an empty line with only an EOL character.
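
For reference, the position in a UnicodeDecodeError is a byte offset into the data being decoded, not a line number or a character count within a line. With a value as small as 184 it is most likely the offset from the start of the file. Below is a minimal sketch for locating and inspecting the offending byte. The helper names and the sample bytes are made up for illustration; on the real file you would pass the result of open(fileName, 'rb').read() instead of the sample.

```python
def find_first_bad_byte(data: bytes, encoding: str = "utf-8"):
    """Return (offset, byte value) of the first undecodable byte, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the byte offset reported as "position" in the message
        return err.start, data[err.start]
    return None


def show_bytes_around(data: bytes, position: int, context: int = 16) -> bytes:
    """Return the raw bytes surrounding a byte offset."""
    start = max(0, position - context)
    return data[start:position + context]


# Hypothetical sample: valid UTF-8 text followed by a lone 0xDA byte.
sample = "1\n00:00:19,150 --> 00:00:20,280\nA".encode("utf-8") + b"\xdaN LO ESCUCHO.\n"

bad = find_first_bad_byte(sample)
if bad is not None:
    offset, value = bad
    print("bad byte", hex(value), "at offset", offset)
    print(show_bytes_around(sample, offset, 8))
```

Printing the surrounding bytes usually makes it obvious which line of the file the offset falls in.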

I suspect the new timestamps I'm writing are causing this problem, but I have no clue which character it may be. When I look up the character 0xda, I find it's a Ú; however, that character is read normally in other places, and I'm not even overwriting it.
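
One explanation consistent with the error message: the single byte 0xDA is how Ú is stored in Latin-1 or cp1252, whereas UTF-8 stores Ú as the two bytes 0xC3 0x9A. A lone 0xDA followed by an ordinary letter is therefore not valid UTF-8. The short sketch below only illustrates that difference; it makes no claim about how the file in question was actually produced.

```python
u_acute = "\u00da"  # the character Ú, written as an escape to keep this source ASCII

# The same character has different byte representations per encoding.
utf8_bytes = u_acute.encode("utf-8")      # two bytes
latin1_bytes = u_acute.encode("latin-1")  # one byte
print(utf8_bytes, latin1_bytes)

# Bytes as a Latin-1 or cp1252 writer would produce them for "AÚN".
raw = b"A\xdaN"
print(raw.decode("latin-1"))

# Decoding those same bytes as UTF-8 fails on the 0xDA byte.
try:
    raw.decode("utf-8")
    reason = None
except UnicodeDecodeError as err:
    reason = err.reason
print(reason)
```

If the original file decodes fine as UTF-8 and the generated one does not, comparing the bytes of the same subtitle line in both files should show whether Ú is stored as one byte or two.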

In case you haven't seen an SRT file before, it looks like this:

9
00:00:15,377 --> 00:00:18,570
ESTAMOS HACIENDO
UN FASCINANTE EXPERIMENTO.

10
00:00:19,150 --> 00:00:20,280
AÚN LO ESCUCHO.


The lines where I do the replacement are the following:


inList[line] = hoursBegin + ':' + minutesBegin + ':' +  secondsBegin + ',' + millisecondsBegin + ' --> ' +\
            hoursEnd + ':' + minutesEnd + ':' + secondsEnd + ',' + millisecondsEnd + '\n'
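
The thread does not show how the corrected file is written, so the following is only a sketch under that assumption: if the output file is opened without an explicit encoding, Python uses the platform default, which on some systems is not UTF-8, and a later read with encoding="utf8" can then fail. The function names below are hypothetical; the timestamp parts are assumed to be already zero-padded strings, as in the concatenation above.

```python
import os
import tempfile


def format_timestamp_line(begin, end):
    """Build an SRT timestamp line from two (hours, minutes, seconds, ms) tuples of strings."""
    return "{}:{}:{},{} --> {}:{}:{},{}\n".format(*begin, *end)


def write_lines(path, lines, encoding="utf-8"):
    """Write lines with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding=encoding) as out_file:
        out_file.writelines(lines)


def read_lines(path, encoding="utf-8"):
    """Read lines back using the same explicit encoding."""
    with open(path, "r", encoding=encoding) as in_file:
        return list(in_file)


lines = [
    "10\n",
    format_timestamp_line(("00", "00", "19", "150"), ("00", "00", "20", "280")),
    "A\u00daN LO ESCUCHO.\n",  # "AÚN LO ESCUCHO."
    "\n",
]

# Round trip through a temporary file: write and read with the same encoding.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fixed.srt")
    write_lines(path, lines)
    round_trip = read_lines(path)

print(round_trip == lines)
```

The point of the design is simply that the writer and the reader name the same encoding, so neither side depends on the platform default.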
Thanks in advance.

P.S.: Please excuse me if I'm not being very clear about some things; just let me know and I'll clarify. I've been working long hours and I'm kind of stuck.


Messages In This Thread
'utf-8' codec can't decode byte 0xda in position 184: invalid continuation byte - by karkas - Sep-02-2019, 09:45 PM
