Python Forum

Full Version: Strange Characters in JSON returned string
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Hi,

I'm getting some strange characters at the beginning of an string that I am setting up for a request payload.

In this example I am trying to read a text file containing one data item and then enumerate it.

with open('list.txt', 'r') as f:
             d = {k: v.strip() for (k, v) in enumerate(f, start=1)}
             data = json.dumps(d)
    
             with open('out.txt', 'w') as f_out:  
             f_out.write(data)
the resulting out.txt contains

{"1": "\u00ef\u00bb\u00bf10841911101489"}

the return should read :

{"1": "10841911101489"}

Can anyone help please?

Many thanks,

Fiorano
hard to say without knowing what's in list.txt, can you provide that?
It seems as though its due to the encoding of my source txt file.

By default it is output from my source app as a UTF-8 file. If I open and save this as an ANSI text file the process works. Is there anything I can do at the start of my python app to amend the encoding?

Thanks for your help
You read a file, which has a UTF8 encoding with Byte Order Mark: EF BB BF

Use as encoding utf-8-sig when you read the file.
Then the BOM is stripped away.

The prefix \uxxxx is just a representation for Unicode code points in Json and also for Python.
What you see, are the first three bytes, which defines the Byte Order. Usually this is not used.
I guess you must seek for documents, which are still using the Byte Order Mark.
(Dec-02-2019, 03:56 PM)DeaD_EyE Wrote: [ -> ]You read a file, which has a UTF8 encoding with Byte Order Mark: EF BB BF

Use as encoding utf-8-sig when you read the file.
Then the BOM is stripped away.

The prefix \uxxxx is just a representation for Unicode code points in Json and also for Python.
What you see, are the first three bytes, which defines the Byte Order. Usually this is not used.
I guess you must seek for documents, which are still using the Byte Order Mark.

Perfect Thank You!!!