here is what i am doing

Skaperen · Jun-24-2020, 03:37 AM

here is what i am doing that got me to ask some questions and do a bunch of googling. i am creating a class to open a file in a special way. it will also have some more flexible arguments. i am currently calling it ztopen(). i have made a more limited version that works and is now in use by a couple scripts.

it supports compression or decompression based on the extension in the file path ( '.gz' or '.bz2' or '.xz'). it support 2 positional arguments, the path and mode. it supports a number of new keyword arguments, append=, binary=, mode=, name=, read=, text=, write=. name= is another way to specify the path. mode= is an additional mode specification. the other 5 are logical flags for various individual modes.

for files being written, a temporary file is created in the same directory (so it ends up in the same file system). it is renamed to the intended name when the file is closed.

if the name is given by both positional argument and keyword argument, it must be the same path. if the paths are different, it will raise TypeError. the name may be given as an int representing a file descriptor just like open() does now. path strings may be given as str, bytes, or bytearray.

if the mode is given by both positional argument and keyword argument, the two are combined. this allows some mode letters in one and some in the other. duplicate mode letters are ignored. mode strings may be given as str, bytes, or bytearray.

the mode keyword arguments allow bool and numeric values. modes as separate keyword arguments may be mixed with mode strings if there is no conflict. conflicts will raise TypeError. the value None will be as if that argument was not given.

that is a lot of checking to do. at least it only has to do that once per file being opened.

**Gribouillis** · Jun-24-2020, 05:55 AM

If the name is not a bytearray, you could convert it with os.fspath() to accept pathlike objects. This also saves a type check for bytes or str because you know that the value returned by os.fspath is one of them.

Skaperen · Jun-25-2020, 02:54 AM

i'll have to think about dropping support for bytearray. many of my functions support it.

Skaperen · Jun-25-2020, 05:48 AM

what does os.fspath() accomplish?

**Gribouillis** · (This post was last modified: Jun-25-2020, 06:34 AM by Gribouillis.)

If the object is not already a str or a bytes, it calls the object's __fspath__() method which returns a str or a bytes representing the filesystem path of the object. This works for "pathlike" objects such as pathlib.Path or potentially user defined types that want to implement __fspath__. Using this makes your function ready to handle more general ways of pointing to a file in the filesystem.

Skaperen · Jun-26-2020, 01:04 AM

i'm trying to understand this method would be used would programmers normally use it before they call open()? what advantage would there be for ztopen() to pass name strings through it? just another feature they won't need to code?

**Gribouillis** · Jun-26-2020, 07:30 AM

It is a very good question, my idea is that some functions that take a path argument in the standard library tend to accept now a pathlike argument. This is the case for the built-in open() function. Clearly, this has not yet been extended to all the open() functions in the library.

That's a general problem with type checking. Suppose that 3 functions A(), B(), C() all take a path argument and perform a type checking on their argument. If C calls B and B calls A, the type checking will be done 3 times.

You're telling me that it should be the caller's responsibility to call os.fspath() to handle pathlike arguments, but then your function will check again that the argument is a str or a bytes, which is not necessary. The advantage of calling os.fspath() in your function is that it replaces the type checking and at the same time it makes the function more general.

Skaperen · (This post was last modified: Jun-26-2020, 05:52 PM by Skaperen.)

while thinking out some testing logic, i keep ending up with a lot of redundancy. but, i keep thinking to myself that at least this only happens once per file to be opened. that can still impact scripts intended to do things like scanning trees and opening every file to back them all up.

i'm now thinking i should avoid having the ability to mix the ways of providing various arguments and require one way be used in the API. then i can start by determine which way is being used, raise TypeError if 2 or 3 are used at same time (match or no match), and select the next test logic based on which way.

1. name and mode in positional arguments
2. name= and mode= in keyword arguments
3. name= and individual modes in keyword arguments:
    append=  (like 'a')
    binary=  (like 'b')
    exwrite= (like 'x')
    read=    (like 'r')
    text=    (like 't')
    update=  (like '+')
    write=   (like 'w')
these want true/false, anything that can follow if, not.

Skaperen · Jun-29-2020, 05:54 PM

(Jun-24-2020, 05:55 AM)Gribouillis Wrote: If the name is not a bytearray, you could convert it with os.fspath() to accept pathlike objects. This also saves a type check for bytes or str because you know that the value returned by os.fspath is one of them.

i still need to check which it is since i am appending a unique string to the file name to make the temporary file name.

here is what i am doing

User Panel Messages

Announcements