Problem with readlines() assignment

Sunioj · (This post was last modified: Oct-26-2019, 11:52 PM by Sunioj.)

Hello there,
I have an assignment in which I need to write a loop that reads each line of a text file and tests it against a condition. If the line contains special characters, it is rejected. If it is alphanumeric, it is valid:

filename = open('strings.txt','r')
data = filename.readlines()

for line in data:
    stripped_line = line.strip()
    if stripped_line.isalnum():
        print(stripped_line + " was ok.")
    else:
        print(stripped_line + " was invalid.")
filename.close()

It runs perfectly. However, the editor where students write their code is embedded in the course webpage, and it gives me this annoying error every time:

Output:
ï»¿5345m34534l was invalid.
no2no123non4 was ok.
noq234n5ioqw#% was invalid.
%#""SGMSGSER was invalid.
doghdp5234 was ok.
sg,dermoepm was invalid.
43453-frgsd was invalid.
hsth())) was invalid.
bmepm35wae was ok.
vmopaem2234+0+ was invalid.
gsdm12313 was ok.
gswrgsrdgrsgsig45 was ok.
)/(/)(#=%#)%/ was invalid.
++-+-+--+--+-+>-<+-<<_<-+>>++. was invalid.

Incorrect output: your program printed "ï»¿5345m34534l", but should have printed "5345m34534l"

I could not for the life of me figure out why the characters ï»¿ were appearing at the start of the line. I googled it and found that it is some kind of formatting declaration specific to text documents (if I am correct).

There is a section in this course that describes the need to put an encoding declaration at the top of the Python code in some cases.

Such as:

# -*- coding: UTF8 -*-

But it did not work. What is even more confusing is that if I print the variable holding the result of readlines(), it appears that the special characters are physically written at the start of the first line of the text file:

print(data)

Output:
['ï»¿5345m34534l\n', 'no2no123non4\n', 'noq234n5ioqw#%\n', '%#""SGMSGSER\n', 'doghdp5234\n', 'sg,dermoepm\n', '43453-frgsd\n', 'hsth()))\n', 'bmepm35wae\n', 'vmopaem2234+0+\n', 'gsdm12313\n', 'gswrgsrdgrsgsig45\n', ')/(/)(#=%#)%/\n', '++-+-+--+--+-+>-<+-<<_<-+>>++.\n']

Some help to solve this would be awesome. This seems like something really basic which I can't figure out, and spent a lot of time on it.

Thanks in advance,
Koji

This also does not work:

filename = open('strings.txt', encoding='utf-8-sig')

***snippsat*** · (This post was last modified: Oct-27-2019, 12:00 AM by snippsat.)

strings.txt is saved with a Byte Order Mark (BOM).
This should fix it:

with open('strings.txt', encoding='utf-8-sig') as f:
    data = f.readlines()

If you are in contact with whoever maintains the file, tell them not to use a BOM when saving it. It is bad practice, and they probably do not know that their editor adds one.
For example, Notepad++ has an option under Format to use UTF-8 without a Byte Order Mark (BOM), which makes it clear what it does when saving the file.
Other editors, e.g. VS Code, default to UTF-8 without BOM.
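
To illustrate where the ï»¿ characters come from, here is a minimal sketch. It assumes the file starts with the UTF-8 BOM (the bytes EF BB BF) and that the environment decodes it with a single-byte Windows code page such as cp1252; the sample line is taken from the output above, and the variable names are only for illustration.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF, placed at the very start of the file.
raw = codecs.BOM_UTF8 + "5345m34534l\n".encode("utf-8")

# Decoded as cp1252 (assumed here), each BOM byte becomes one visible character.
as_cp1252 = raw.decode("cp1252")

# Decoded as plain utf-8, the BOM becomes the invisible character U+FEFF,
# which is still part of the string and is not removed by strip().
as_utf8 = raw.decode("utf-8")

# Decoded as utf-8-sig, the BOM is dropped.
as_utf8_sig = raw.decode("utf-8-sig")

print(repr(as_cp1252))
print(repr(as_utf8))
print(repr(as_utf8_sig))
print(as_utf8.strip().isalnum(), as_utf8_sig.strip().isalnum())
```

In both of the first two cases the extra leading characters make isalnum() return False for a line that is otherwise alphanumeric.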

Sunioj · Oct-27-2019, 12:07 AM

It didn't work :( I think something is really messed up with the online web course. I will send them an email.

***snippsat*** · (This post was last modified: Oct-27-2019, 12:17 AM by snippsat.)

That's strange. If I do a quick test:

with open('bom.txt') as f:
    data = f.read()
    print(data)

Output:
ï»¿hello world

with open('bom.txt', encoding='utf-8-sig') as f:
    data = f.read()
    print(data)

Output:
hello world

You can fix the file yourself before reading it into Python, e.g. with Notepad++.
Notepad++ is what I used to make the file with a BOM in the test above.
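
If no suitable editor is at hand, the same fix can probably be done from Python itself. This is only a sketch: remove_bom is a hypothetical helper, not something from the course or from the standard library, and it assumes you are allowed to overwrite the file. The demo runs on a throwaway file rather than on strings.txt.

```python
import codecs
import os
import tempfile


def remove_bom(path):
    """Rewrite the file without a leading UTF-8 BOM. Return True if one was removed."""
    with open(path, "rb") as f:
        raw = f.read()
    if not raw.startswith(codecs.BOM_UTF8):
        return False  # nothing to do
    with open(path, "wb") as f:
        f.write(raw[len(codecs.BOM_UTF8):])
    return True


# Demo on a temporary file that starts with a BOM.
demo_path = os.path.join(tempfile.mkdtemp(), "bom.txt")
with open(demo_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"hello world\n")

first_call = remove_bom(demo_path)   # BOM present, gets removed
second_call = remove_bom(demo_path)  # BOM already gone

with open(demo_path, encoding="utf-8") as f:
    cleaned = f.read()

print(first_call, second_call, repr(cleaned))
```

Working on the raw bytes avoids guessing the text encoding; only the three BOM bytes at the start are touched.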

Sunioj · (This post was last modified: Oct-27-2019, 12:28 AM by Sunioj.)

Here is what this course web site says:

[Image: Screenshot-from-2019-10-27-03-25-272eec01dc99da6142.png]

I have Linux, so I need to check which save settings this text editor offers... will get back here.

Thank you

**perfringo** · (This post was last modified: Oct-27-2019, 06:20 AM by perfringo.)

Some observations about the task description (and the provided solution): "I need to make a loop that reads each line in a text file and then run a condition against the line of text. If it has special characters, it is rejected. If it is alphanumeric, then it is valid"

One can iterate over the file's lines directly; there is no need to use .readlines() (which slurps all the data in).

strip() strips whitespace from both ends, so if a row starts (or ends) with whitespace, isalnum() will give a wrong result. The correct way is .rstrip('\n'): only the newline at the end of the row should be stripped, not spaces.

Space is not alphanumeric:

>>> ' a'.isalnum()
False

And f-strings have been around for quite a long time, so I am not 'impressed' by the print statement.
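
Putting these observations together, one possible version could look like the sketch below. The function name check_lines and the demo file are made up for illustration, and encoding='utf-8-sig' is borrowed from the suggestion earlier in the thread, so it assumes the file really is UTF-8 (with or without a BOM).

```python
import os
import tempfile


def check_lines(path):
    """Return one verdict string per line of the file at `path`."""
    verdicts = []
    # 'utf-8-sig' drops a leading BOM if there is one (assumes a UTF-8 file).
    with open(path, encoding="utf-8-sig") as f:
        for line in f:                    # iterate over the file directly
            text = line.rstrip("\n")      # remove only the newline, keep spaces
            status = "ok" if text.isalnum() else "invalid"
            verdicts.append(f"{text} was {status}.")
    return verdicts


# Demo on a throwaway file saved with a BOM; the second line starts with a space.
demo_path = os.path.join(tempfile.mkdtemp(), "strings_demo.txt")
with open(demo_path, "w", encoding="utf-8-sig") as f:
    f.write("5345m34534l\n a\nhsth()))\n")

results = check_lines(demo_path)
for verdict in results:
    print(verdict)
```

Because only the newline is removed, the line ' a' is reported as invalid, whereas strip() would have turned it into 'a' and accepted it.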
