Posts: 4,653
Threads: 1,496
Joined: Sep 2016
i am getting this:
Output:
Traceback (most recent call last):
  File "amijsonxz_get.py", line 83, in <module>
    all_regions = [x.strip()for x in f]
  File "amijsonxz_get.py", line 83, in <listcomp>
    all_regions = [x.strip()for x in f]
OSError: read() should have returned a bytes object, not 'str'
the file is opened with mode "rt", so i am expecting str. the file is text, a list of AWS region names. should i use binary mode?
you want to see code? you can in this case.
http://ipal.net/python/amijsonxz_get.py
http://ipal.net/python/zopen.py
Tradition is peer pressure from dead people
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
Posts: 1,583
Threads: 3
Joined: Mar 2020
It isn't a core python error string. I imagine it comes from the zopen() stuff above it.
from zopen import openz_read,zfiles_exist,zopen
Where does that come from?
Posts: 4,653
Threads: 1,496
Joined: Sep 2016
Aug-24-2021, 02:23 AM
(This post was last modified: Aug-24-2021, 02:23 AM by Skaperen.)
(Aug-24-2021, 01:20 AM)bowlofred Wrote: Where does that come from?
see the 2nd link. i wrote that. it's been working well for other things. but, of course, that doesn't rule out something wrong in there. it literally comes from my development directory. both files are in there. the other link is a script to download AWS AMI data and store it in compressed JSON. zopen is for the compression.
Posts: 4,653
Threads: 1,496
Joined: Sep 2016
the thing i am trying to understand is what wants bytes from read. this file is not compressed so zopen should be doing a plain open(). and it should be doing this in text mode which would, presumably, be having read() return strings.
the only cases i know where opening the file needs binary is if one or more compression layers will be added or the caller requests binary mode. i got errors like this early on when a coding error opened the file in text mode when compression was added on top of it. that resulted in the compression layer reading the other file layer assuming binary and getting str.
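The layering described above can be sketched with the standard library alone. This is not the code in zopen.py; the helper name `open_text_over_xz` and the choice of lzma are assumptions for illustration. The point is only that the bottom file is opened in binary, the decompressor sits on top of it, and text decoding happens once at the top.

```python
import io
import lzma
import os
import tempfile


def open_text_over_xz(path):
    """Hypothetical helper: return (text_layer, raw_file); caller closes both."""
    raw = open(path, "rb")                         # bottom layer must be binary
    xz = lzma.LZMAFile(raw)                        # decompression consumes bytes
    text = io.TextIOWrapper(xz, encoding="utf-8")  # only the top layer yields str
    return text, raw


def demo():
    """Write a small compressed text file, then iterate it line by line."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "regions.txt.xz")
        with lzma.open(path, "wt", encoding="utf-8") as out:
            out.write("region1\nregion2\n")
        text, raw = open_text_over_xz(path)
        try:
            return [line.strip() for line in text]
        finally:
            text.close()  # closes the LZMAFile, which leaves raw open
            raw.close()
```

Opening the bottom file with "rt" instead would hand str to the decompressor, which is the kind of mix-up described in the post.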
Posts: 1,583
Threads: 3
Joined: Mar 2020
Not sure I understand, but it looks like although simple read() calls appear to work, the iterator object from zopen() doesn't work properly in text mode. The same iteration works fine with open(). I don't know where to start on this.
from zopen import zopen

afn = "afn.txt"

# read() through zopen works in binary mode
print("Binary read zopen: ", end="")
f = zopen(afn, "rb")
print(len(f.read()))
f.close()

# read() through zopen works in text mode
print("Text read zopen: ", end="")
f = zopen(afn, "rt")
print(len(f.read()))
f.close()

# iterating a binary-mode zopen object works
print("Binary iterator zopen: ", end="")
f = zopen(afn, "rb")
print(next(iter(f)))
f.close()

# iterating a text-mode object from the built-in open() works
print("Text iterator open: ", end="")
f = open(afn, "rt")
print(next(iter(f)))
f.close()

# iterating a text-mode zopen object raises OSError
print("Text iterator zopen: ", end="")
f = zopen(afn, "rt")
print(next(iter(f)))
f.close()
Output:
Binary read zopen: 16
Text read zopen: 16
Binary iterator zopen: b'region1\n'
Text iterator open: region1
Text iterator zopen: Traceback (most recent call last):
  File "/ska/runit.py", line 26, in <module>
    print(next(iter(f)))
OSError: read() should have returned a bytes object, not 'str'
Posts: 4,653
Threads: 1,496
Joined: Sep 2016
zopen() is calling open() in this case. no compression is involved. so, zopen must be mangling it, somehow. i replaced the comprehension loop (iterating the file object) with a single read() (intending to do .split('\n') but did not do that, yet) and that worked. i will try other forms of iteration, next. there are attached methods near the end of file zopen.py
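A minimal sketch of the read-then-split workaround mentioned above, assuming the file object's read() returns str. The function name `read_region_names` is made up for illustration. It uses splitlines() rather than split('\n') because splitting on '\n' leaves an empty string at the end when the text ends with a newline.

```python
import io


def read_region_names(f):
    """Hypothetical helper: one read() call, then split into stripped lines."""
    return [line.strip() for line in f.read().splitlines()]


# io.StringIO stands in for the text-mode file object here.
names = read_region_names(io.StringIO("region1\nregion2\n"))
```

This sidesteps the file object's iterator entirely, so it only hides the problem rather than explaining it.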
Posts: 4,653
Threads: 1,496
Joined: Sep 2016
it also fails when iterated like:
f = zopen(fname, 'r')
a = []
for x in f:
    a.append(x.strip())
f.close()
and this happens even with non-compressed files where zopen just calls open and returns the file object. i can't perceive how doing that can make it be a different kind of object. but i am digging into zopen to be sure it isn't doing something to the object, especially something that could make the iterator think it is a binary file. there is logic in zopen to open in binary if it is going to stack a compression object on top (does decompression for read). but this appears to be working right as the read file object is returning str as it should.
if i exhaust this search my next task will be to build a minimal class that replicates the problem.
Posts: 8,165
Threads: 160
Joined: Sep 2016
Check this, it may point you in the right direction:
https://gitlab.idiap.ch/bob/bob.measure/-/issues/23
Looks like similar issue - your _zio class inherits from io.IOBase
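A minimal sketch of how an io.IOBase subclass can produce this error, assuming CPython's C implementation of io. The classes below are hypothetical stand-ins and not the real _zio. In CPython, iterating an IOBase object calls readline(), and the inherited IOBase.readline() calls read() and insists on bytes, so a wrapper that forwards only read() to a text-mode file works for read() but fails when iterated.

```python
import io


class BrokenWrapper(io.IOBase):
    """Hypothetical stand-in: forwards only read() to the wrapped object."""

    def __init__(self, inner):
        self._inner = inner

    def read(self, *args):
        return self._inner.read(*args)


class FixedWrapper(BrokenWrapper):
    """Also forwards readline(), so iteration never reaches IOBase.readline()."""

    def readline(self, *args):
        return self._inner.readline(*args)


def first_line(wrapper):
    """Return the first iterated line, or the OSError text if iteration fails."""
    try:
        return next(iter(wrapper))
    except OSError as exc:
        return "OSError: " + str(exc)


broken = first_line(BrokenWrapper(io.StringIO("region1\nregion2\n")))
fixed = first_line(FixedWrapper(io.StringIO("region1\nregion2\n")))
```

If zopen returns a wrapper like this even for uncompressed files, forwarding readline() (or defining __iter__ and __next__ on the wrapper) would be one possible fix. Whether that matches what zopen.py actually does is a guess.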
Posts: 1,838
Threads: 2
Joined: Apr 2017
(Sep-02-2021, 05:44 AM)Skaperen Wrote: if i exhaust this search my next task will be to build a minimal class that replicates the problem.
Better yet, write a test. That will help you figure out the input leading to the problem and making that test pass demonstrates that the bug has been fixed. It'll then give you regression safety - if you change something that causes the test to fail, you'll know you've broken the code.
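A sketch of what such a regression test might look like with unittest. The names `lines_via` and `TextIterationTest` are made up for illustration, and the built-in open() is used as the opener so the example runs on its own. The idea is to point `opener` at the function under test (for example zopen) and keep the test once it passes.

```python
import os
import tempfile
import unittest


def lines_via(opener, path):
    """Open path in text mode with the given opener and iterate its lines."""
    f = opener(path, "rt")
    try:
        return [line.strip() for line in f]
    finally:
        f.close()


class TextIterationTest(unittest.TestCase):
    # Assumption: replace with the opener under test, e.g. staticmethod(zopen).
    opener = staticmethod(open)

    def test_iteration_yields_stripped_str_lines(self):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "afn.txt")
            with open(path, "w") as out:
                out.write("region1\nregion2\n")
            self.assertEqual(lines_via(self.opener, path), ["region1", "region2"])
```

Saved as a module, this can be run with `python -m unittest`.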
Posts: 4,653
Threads: 1,496
Joined: Sep 2016
i was already trying to write a test. it didn't support any compression and just did what _zio does for an uncompressed file. so i just let _zio be the test. i'll try, again, and make something that doesn't have the mode and option logic, or the extra function layers (their only purpose is to have function names serve as different defaults for the tempname= option, since i have code around here that uses both features by name). i really should take ztopen() out since i added the tempname= option, eliminate the function layers, and rename _zio to zopen.