Python Forum
Strange Characters in JSON returned string
#1
Hi,

I'm getting some strange characters at the beginning of a string that I am setting up for a request payload.

In this example I am trying to read a text file containing one data item and then enumerate it.

import json

with open('list.txt', 'r') as f:
    d = {k: v.strip() for (k, v) in enumerate(f, start=1)}
    data = json.dumps(d)

    with open('out.txt', 'w') as f_out:
        f_out.write(data)
The resulting out.txt contains:

{"1": "\u00ef\u00bb\u00bf10841911101489"}

The output should read:

{"1": "10841911101489"}

Can anyone help please?

Many thanks,

Fiorano
#2
Hard to say without knowing what's in list.txt. Can you provide that?
#3
It seems as though it's due to the encoding of my source txt file.

By default it is output from my source app as a UTF-8 file. If I open it and save it as an ANSI text file, the process works. Is there anything I can do at the start of my Python app to change the encoding?

Thanks for your help
#4
You are reading a file that is UTF-8 encoded with a Byte Order Mark (BOM): EF BB BF

Use utf-8-sig as the encoding when you read the file.
Then the BOM is stripped away.
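
A minimal sketch of that fix, applied to the code from the first post. The function name enumerate_lines_to_json and the temporary demo file are only there for illustration; the one real change is encoding='utf-8-sig' in open().

```python
import json
import os
import tempfile


def enumerate_lines_to_json(src_path, dst_path):
    """Read src_path line by line, number the lines from 1, write JSON to dst_path."""
    # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present.
    # Files without a BOM are read exactly like plain 'utf-8'.
    with open(src_path, 'r', encoding='utf-8-sig') as f:
        d = {k: v.strip() for k, v in enumerate(f, start=1)}

    data = json.dumps(d)

    with open(dst_path, 'w', encoding='utf-8') as f_out:
        f_out.write(data)
    return data


# Demo: build a stand-in for list.txt that starts with the BOM bytes EF BB BF.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'list.txt')
    dst = os.path.join(tmp, 'out.txt')
    with open(src, 'wb') as f:
        f.write(b'\xef\xbb\xbf10841911101489\n')

    result = enumerate_lines_to_json(src, dst)
    print(result)  # {"1": "10841911101489"}
```

Passing the encoding explicitly also means the script no longer depends on the platform's default encoding.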

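The \uXXXX prefix is just the escape notation for Unicode code points, used in JSON and also in Python.
What you see are the three bytes of the BOM, each decoded as a separate character (\u00ef, \u00bb, \u00bf) because the file was read with a single-byte encoding instead of UTF-8. In UTF-8 the BOM does not actually define a byte order; it only marks the file as UTF-8, and it is usually omitted.
So you have to watch out for files that still use the Byte Order Mark.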
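
To see where the strange characters come from, here is a small sketch that decodes the same bytes in three ways. It assumes the file was originally read with cp1252 (the usual "ANSI" default on Western Windows systems); the thread does not say which platform was used, so treat that as an assumption.

```python
import json

# The raw bytes of the file: UTF-8 BOM (EF BB BF) followed by the data item.
raw = b'\xef\xbb\xbf10841911101489'

# Assumed default: a single-byte encoding turns each BOM byte into its own character.
wrong = raw.decode('cp1252')
# Plain 'utf-8' decodes the BOM into one character, U+FEFF, and keeps it.
plain = raw.decode('utf-8')
# 'utf-8-sig' decodes the BOM and strips it.
right = raw.decode('utf-8-sig')

print(json.dumps(wrong))  # "\u00ef\u00bb\u00bf10841911101489"
print(json.dumps(plain))  # "\ufeff10841911101489"
print(json.dumps(right))  # "10841911101489"
```

Note that plain 'utf-8' is not enough here: it would leave \ufeff at the start of the string.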
My code examples are always for Python >=3.6.0
#5
(Dec-02-2019, 03:56 PM)DeaD_EyE Wrote: You read a file, which has a UTF8 encoding with Byte Order Mark: EF BB BF

Perfect, thank you!!!