'utf-8' codec can't decode byte 0xda in position 184: invalid continuation byte

karkas · (This post was last modified: Sep-02-2019, 09:45 PM by karkas.)

Hi everyone,

I'm getting this error. I've been searching online, but I don't understand how the explanations apply to my specific case or why it is happening.

This is the error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xda in position 184: invalid continuation byte

I'm trying to read a text file with the following lines:

inFile = open(fileName, 'r', encoding="utf8")
fileList = []
for line in inFile:
    fileList.append(line)

What I'm reading is a simple SRT file. I wrote a program that takes an SRT file and fixes the timestamps to eliminate overlaps, because the editor sometimes produces them. That function works correctly and reads the original file without this error. To create the new, corrected file, I just copy the old file and replace the timestamp lines with the corrected ones. However, when I try to read the newly generated file to convert it to another format, I get this error. I've been working with functions that convert and manipulate this kind of file for a while and had never gotten this error, only a similar one that I can't remember now, which is why I started using encoding="utf8".

I don't really know what "position 184" means: none of the lines is longer than 33 characters, and line 184 of the file is an empty line with only an EOL character.
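
A note on "position 184": the number in a UnicodeDecodeError is an offset into the bytes being decoded, not a line number or a column. The sketch below decodes raw bytes and reports the byte offset, the line number, and the nearby bytes of the first byte that fails. The function name find_bad_byte and the sample data are made up for illustration, not taken from my program.

```python
def find_bad_byte(data: bytes, encoding: str = "utf-8", context: int = 20):
    """Return (byte_offset, line_number, nearby_bytes) for the first byte
    that cannot be decoded, or None if the data decodes cleanly."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        start = exc.start  # offset into `data`, counted in bytes
        # Count the newlines before the bad byte to get a 1-based line number.
        line_number = data.count(b"\n", 0, start) + 1
        nearby = data[max(0, start - context):start + context]
        return start, line_number, nearby
    return None


# Illustrative data only: a subtitle cue where "Ú" is stored as the single byte 0xDA.
sample = b"10\n00:00:19,150 --> 00:00:20,280\nA\xdaN LO ESCUCHO.\n"
result = find_bad_byte(sample)
print(result)
```

With a real file, the bytes would come from opening it in binary mode, for example open(fileName, "rb"), and passing the result of read() to the function.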

I suspect the new timestamps I'm writing are causing the problem, but I have no clue which character it could be. When I look up 0xda, I find it's a Ú; however, that character is read normally in other places and I'm not even overwriting it.
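
One thing worth checking here: 0xDA is Ú only in single-byte encodings such as Latin-1 (and Windows-1252). In UTF-8, Ú is stored as two bytes, 0xC3 0x9A, so a lone 0xDA followed by an ordinary letter is not valid UTF-8. A minimal sketch of the difference, using a line from the sample below (the helper name decodes_as_utf8 is just for illustration):

```python
text = "AÚN LO ESCUCHO."

utf8_bytes = text.encode("utf-8")      # "Ú" becomes two bytes: 0xC3 0x9A
latin1_bytes = text.encode("latin-1")  # "Ú" becomes one byte: 0xDA


def decodes_as_utf8(data: bytes) -> bool:
    """Report whether the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes.hex())
print(latin1_bytes.hex())
print(decodes_as_utf8(utf8_bytes), decodes_as_utf8(latin1_bytes))
```

If the generated file contains the one-byte form, that would suggest it was written in a legacy encoding rather than UTF-8, even though the Ú itself was never edited.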

In case you haven't seen an SRT file before, it looks like this:

9
00:00:15,377 --> 00:00:18,570
ESTAMOS HACIENDO
UN FASCINANTE EXPERIMENTO.

10
00:00:19,150 --> 00:00:20,280
AÚN LO ESCUCHO.

The lines where I do the replacement are the following:

inList[line] = hoursBegin + ':' + minutesBegin + ':' +  secondsBegin + ',' + millisecondsBegin + ' --> ' +\
            hoursEnd + ':' + minutesEnd + ':' + secondsEnd + ',' + millisecondsEnd + '\n'
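
The code that writes the corrected file is not shown above, so the following is only an assumption about where the problem might come from: when a file is opened for writing without an encoding argument, Python falls back to a platform-dependent default, which on Windows is often a single-byte code page that stores Ú as 0xDA. A minimal sketch of writing and re-reading the lines with an explicit encoding on both sides (write_srt_lines and read_srt_lines are hypothetical helper names):

```python
import os
import tempfile


def write_srt_lines(path, lines):
    """Write the lines with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.writelines(lines)


def read_srt_lines(path):
    """Read the lines back using the same encoding they were written with."""
    with open(path, "r", encoding="utf-8") as in_file:
        return list(in_file)


# Illustrative cue; the file name "fixed.srt" is made up for this example.
lines = ["10\n", "00:00:19,150 --> 00:00:20,280\n", "AÚN LO ESCUCHO.\n", "\n"]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fixed.srt")
    write_srt_lines(path, lines)
    round_trip = read_srt_lines(path)
```

The point of the sketch is only that the writer and the reader agree on the encoding; which encoding the original SRT file uses would still need to be confirmed.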

Thanks in advance.

PS: Please excuse me if I'm not being very clear about some things; just let me know and I'll clarify. I've been working long hours and I'm kind of stuck.
