Python Forum
Problem with readlines() assignment
#1
Hello there,
I have an assignment in which I need to write a loop that reads each line of a text file and then tests the line against a condition. If the line has special characters, it is rejected. If it is alphanumeric, it is valid:

filename = open('strings.txt','r')
data = filename.readlines()

for line in data:
    stripped_line = line.strip()
    if stripped_line.isalnum():
       print (stripped_line + " was ok.")
    else:
       print (stripped_line + " was invalid.")
filename.close()
It runs perfectly. However! The place where the students write the code is embedded into the course webpage, and it is giving me this annoying error every time:

Output:
5345m34534l was invalid.
no2no123non4 was ok.
noq234n5ioqw#% was invalid.
%#""SGMSGSER was invalid.
doghdp5234 was ok.
sg,dermoepm was invalid.
43453-frgsd was invalid.
hsth())) was invalid.
bmepm35wae was ok.
vmopaem2234+0+ was invalid.
gsdm12313 was ok.
gswrgsrdgrsgsig45 was ok.
)/(/)(#=%#)%/ was invalid.
++-+-+--+--+-+>-<+-<<_<-+>>++. was invalid.
Incorrect output: your program printed "5345m34534l", but should have printed "5345m34534l"

I could not for the life of me figure out why extra characters were appearing in front of the first line. I googled it and found that it is some kind of formatting declaration specific to text documents (if I am correct).

There is a section in this course that outlines the need, in some cases, to declare the source encoding for the interpreter at the top of the Python code.

Such as:

# -*- coding: UTF8 -*-
But it did not work. What is even more confusing is that if I print the variable that stores the result of readlines(), the special characters appear to be physically written at the start of the first line of the text file:

print(data)


Output:
['5345m34534l\n', 'no2no123non4\n', 'noq234n5ioqw#%\n', '%#""SGMSGSER\n', 'doghdp5234\n', 'sg,dermoepm\n', '43453-frgsd\n', 'hsth()))\n', 'bmepm35wae\n', 'vmopaem2234+0+\n', 'gsdm12313\n', 'gswrgsrdgrsgsig45\n', ')/(/)(#=%#)%/\n', '++-+-+--+--+-+>-<+-<<_<-+>>++.\n']
Some help to solve this would be awesome. This seems like something really basic that I can't figure out, and I have spent a lot of time on it.

Thanks in advance,
Koji

This also does not work:

filename = open('strings.txt', encoding='utf-8-sig')
#2
strings.txt is saved with a byte order mark (BOM).
This should fix it:
with open('strings.txt', encoding='utf-8-sig') as f:
    data = f.readlines()
If you are in contact with whoever produced the file, tell them not to save files with a BOM; it is bad practice (they probably do not know that their editor adds one).
For example, in Notepad++ choose UTF-8 (without byte order mark (BOM)) under Format, which makes it clear what the editor does when it saves the file.
Other editors, e.g. VS Code, default to UTF-8 without a BOM.
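To see the effect of the BOM on isalnum() in isolation, here is a minimal sketch. The temporary file and the sample line abc123 are made up for the demo and are not the course's strings.txt:

```python
import codecs
import os
import tempfile

# Create a throwaway file that starts with a UTF-8 BOM, similar to what
# some editors write. The content is a made-up sample line.
path = os.path.join(tempfile.mkdtemp(), "bom_demo.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"abc123\n")

# Plain 'utf-8' keeps the BOM as the character U+FEFF at the start of the text.
# strip() does not remove it, because U+FEFF is not whitespace.
with open(path, encoding="utf-8") as f:
    with_bom = f.readline().strip()

# 'utf-8-sig' drops a leading BOM if one is present.
with open(path, encoding="utf-8-sig") as f:
    without_bom = f.readline().strip()

print(with_bom.isalnum())      # the hidden U+FEFF makes the line non-alphanumeric
print(without_bom.isalnum())   # the same line passes once the BOM is gone
```

If the file is opened with some other single-byte encoding instead, the three BOM bytes typically show up as three odd-looking characters rather than one invisible one, but the isalnum() check fails in the same way.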
#3
(Oct-27-2019, 12:00 AM)snippsat Wrote: strings.txt is saved with Byte Order Mark(BOM).

It didn't work :( I think something is really messed up with the online web course. I will send them an email.
#4
That's strange. If I do a quick test:
with open('bom.txt') as f:
    data = f.read()
    print(data)
Output:
hello world
with open('bom.txt', encoding='utf-8-sig') as f:
    data = f.read()
    print(data)
Output:
hello world
You can fix the file yourself before reading it into Python, e.g. with Notepad++.
Notepad++ is what I used to create the file with a BOM for the test above.
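If no suitable editor is at hand, the same clean-up could also be done in Python. This is only a sketch, assuming the file is UTF-8 and that you are allowed to overwrite it; the helper name strip_bom and the demo file are hypothetical:

```python
import codecs
import os
import tempfile


def strip_bom(path):
    """Rewrite the file at `path` without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as f:
        raw = f.read()
    if not raw.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(raw[len(codecs.BOM_UTF8):])
    return True


# Demo on a throwaway file, not on a real data file.
demo_path = os.path.join(tempfile.mkdtemp(), "bom.txt")
with open(demo_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"hello world\n")

removed = strip_bom(demo_path)
print(removed)  # True when a BOM was present and has been stripped
```

Working on the raw bytes avoids any guesswork about newline translation; only the three BOM bytes at the very start are dropped.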
#5
Here is what this course web site says:

[Image: Screenshot-from-2019-10-27-03-25-272eec01dc99da6142.png]

I am on Linux, so I need to check which encoding settings my text editor can save with... I will get back here.

Thank you
#6
Some observations about the task description (and the provided solution): "I need to make a loop that reads each line in a text file and then run a condition against the line of text. If it has special characters, it is rejected. If it is alphanumeric, then it is valid"

One can iterate over the file's lines directly; there is no need to use .readlines() (which slurps all the data in).

strip() strips whitespace from both ends, so if a row starts (or ends) with whitespace, isalnum() gives a misleading result: the row is reported as valid even though it contains a space. The correct way is .rstrip('\n'): one should strip only the newline at the end of the row, not spaces.

Space is not alphanumeric:

>>> ' a'.isalnum()
False
And f-strings have been around for quite a long time, so I am not 'impressed' by the print statement.
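Put together, these observations could look something like the sketch below. The function name classify_lines and the sample file are made up for illustration, and encoding='utf-8-sig' is borrowed from the earlier reply rather than being part of the assignment:

```python
import os
import tempfile


def classify_lines(path):
    """Return one message per line saying whether the line is alphanumeric."""
    messages = []
    # 'utf-8-sig' ignores a leading BOM if the file has one.
    with open(path, encoding="utf-8-sig") as f:
        for line in f:                   # iterate directly, no readlines()
            text = line.rstrip("\n")     # drop only the newline, keep spaces
            verdict = "ok" if text.isalnum() else "invalid"
            messages.append(f"{text} was {verdict}.")
    return messages


# Demo on a throwaway file with three made-up lines; the second one
# starts with a space and must therefore be rejected.
demo_path = os.path.join(tempfile.mkdtemp(), "strings_demo.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("doghdp5234\n a1\nhsth()))\n")

for message in classify_lines(demo_path):
    print(message)
```

With strip() instead of rstrip("\n"), the second line would be reported as ok, which is exactly the problem described above.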