Python Forum

Full Version: strange string
i've got some code near the end of a big function:
    ...
    doc_str = getattr(function, '__doc__', None)
    print(len(doc_str), 'doc_str =', repr(doc_str), flush=True)
    print(len(doc_str), 'doc_str =', repr(doc_str[:150]), flush=True)
    return doc_str
the output from the 2nd print() call is:
Output:
0 doc_str = ''
in previous versions of the code, doc_str was getting data, so i put the 1st print() call back in. the original change was adding len(doc_str) and the [:150] slice (i added the slice so as not to print 100+ lines). now, with 2 print() calls, the 1st prints the doc string i expect, quoted by repr(), but len(doc_str) comes out 0. so after the getattr() call gets the doc string and assigns it to doc_str, the 1st print() call outputs 0 doc_str = ' followed by over 100 lines of documentation (it looks like the correct documentation) and then the ending quote from repr(). it is curious how len() can be 0 with lots of characters there. then the 2nd print() call, with the sliced string, outputs:
Output:
0 doc_str = ''
what can cause this weird effect? it's losing data and len() is just wrong. the caller is getting an empty string, too. it should be getting the documentation i see from the 1st print() call (less the quotes that repr() adds in the print() call).

i added the code r = repr(doc_str) to assign the repr() result to r, then printed the length of r and got 3079. so repr() got 3077 characters and len() got 0. could this string be internally corrupt? it originally comes from the botocore library, which has been working OK for me. this is the first time i am extracting doc strings from its client methods. i don't know if there is any C/binary code in botocore to mess up these strings.
Try to print type(doc_str) perhaps.
(Apr-20-2019, 07:56 AM)Gribouillis Wrote: Try to print type(doc_str) perhaps.

i get <class 'botocore.docs.docstring.ClientMethodDocstring'> from that. so i probably should do str() on it. still, i expect attribute __doc__ to be a str already.

that seems to do it. len(str(doc_str)) = 3079
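the str() workaround above could be folded into the lookup roughly like this. it is only a sketch: get_doc_text and its limit parameter are made-up names, not anything from botocore, and it also covers the case where __doc__ is None, which would make len() raise a TypeError.

```python
def get_doc_text(function, limit=None):
    """Return the doc string of `function` as a plain str ('' if it has none).

    get_doc_text and limit are hypothetical names used for illustration.
    """
    doc_str = getattr(function, '__doc__', None)
    if doc_str is None:
        # No doc string at all: len(None) would raise TypeError.
        return ''
    # str() forces a str-like object to produce its real text as a plain str.
    text = str(doc_str)
    return text if limit is None else text[:limit]


def documented():
    """hello world"""


print(len(get_doc_text(documented)), repr(get_doc_text(documented, 5)))
```

converting once and then measuring or slicing the plain str keeps len() and [:150] in agreement with what gets printed.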

it looks like str() is doing the same thing as repr(), including changing newline characters to the 2-character string '\n' (which would be '\\n' in a source code literal that creates the same 2-character string).

i tried copy.copy() and it just copies that class object as-is (i.e. the copy behaves the same).

i did dir(doc_str) and its attributes look like str attributes.

looking closer it looks like it has all str attributes plus a few more. so it's trying to be like a str in some way.
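one way an object can have every str attribute and still report len() of 0 is to be a str subclass whose underlying value is empty, with the real text generated in __str__ and __repr__. the sketch below reproduces the symptoms under that assumption. LazyDoc and the sample text are invented for illustration, and this is not claimed to be botocore's actual code.

```python
class LazyDoc(str):
    """Hypothetical str subclass whose text is generated only on demand."""

    def __new__(cls, make_text):
        # The underlying str value is left empty, so len() and slicing,
        # which read that value directly, see nothing.
        self = super().__new__(cls)
        self._make_text = make_text
        return self

    def __str__(self):
        # str() and print() go through here and get the generated text.
        return self._make_text()

    def __repr__(self):
        # repr() shows the generated text as well.
        return repr(self._make_text())


doc = LazyDoc(lambda: "line one\nline two\n")

print(len(doc), 'doc =', repr(doc))        # length 0, yet repr() shows text
print(len(doc), 'doc =', repr(doc[:150]))  # the slice is empty as well
print(len(str(doc)))                       # length of the generated text
```

with a class like this, copy.copy() would just give another object that behaves the same way, which matches what was seen above.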
Skaperen Wrote: i expect attribute __doc__ to be a str already.
I agree, if it is not a str, it looks like a flaw in the design of this botocore module. Try isinstance(doc_str, str) and also inspect.getmro(type(doc_str)) for more information.
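The two checks suggested here can be wrapped in a small helper, sketched below. The names describe and Tagged are hypothetical; Tagged is just a trivial str subclass standing in for the object from the thread, so the example runs without botocore.

```python
import inspect


def describe(obj):
    """Return (is obj a str?, class names in the MRO of type(obj))."""
    mro = inspect.getmro(type(obj))
    return isinstance(obj, str), [cls.__name__ for cls in mro]


class Tagged(str):
    """Hypothetical str subclass used only to show the output shape."""


print(describe("plain"))
print(describe(Tagged("x")))
```

If str appears in the list for the real doc_str, the object is a str subclass, and an empty underlying value would explain len() returning 0.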