Python Forum

Python .json problem with UTF-8 file
When my program tries to deserialize a .json file, it chokes on the UTF-8 designation (EF BB BF) at the beginning of the .json file. 
The error is: "No JSON object could be decoded"

Is there a way to ignore those three bytes while reading .json files, short of doing a binary read to strip them out?
Is the JSON file valid? You can check it with JSON Formatter.
Yes, it is. When you copy the text and put it in the verifier, the special characters do not show up. They are all unprintable, and you can't copy them with a text editor. Only when you open the JSON file with a hex editor can you see them. Strip them out and json deserializes the file just fine.
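If a hex editor isn't handy, the same check can probably be done from Python by looking at the first three bytes of the file. This is only a rough sketch: has_utf8_bom is a made-up helper name, not something from the json module, and the path you pass in is whatever your file is called.

```python
import codecs

def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    # Read in binary mode so no decoding happens before we look at the bytes.
    with open(path, 'rb') as f:
        return f.read(3) == codecs.BOM_UTF8
```

Calling has_utf8_bom on the problem file should tell you whether those three bytes are what is tripping up the decoder.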
Are you using Python 2.7? Python 3 handles UTF-8 naturally, but for Python 2.7 you have to put # -*- coding: utf-8 -*- under the shebang:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
    .....code here.....
The file is saved with a UTF-8 Byte Order Mark (BOM).
You can try opening it with the utf-8-sig codec, which drops a leading BOM while decoding:
import json
import codecs

# 'utf-8-sig' decodes UTF-8 and skips a leading BOM if one is present
with codecs.open('sample.json', 'r', 'utf-8-sig') as f:
    data = json.load(f)
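For what it's worth, the same idea should also work with io.open, which takes an encoding argument on both Python 2.7 and Python 3 (on Python 3 the built-in open is the same function). A minimal sketch, assuming the file is UTF-8 with or without a BOM; load_json is just an illustrative name:

```python
import io
import json

def load_json(path):
    """Load a UTF-8 JSON file whether or not it starts with a BOM."""
    # 'utf-8-sig' skips a leading BOM when there is one and otherwise
    # behaves like plain 'utf-8', so one call covers both kinds of file.
    with io.open(path, 'r', encoding='utf-8-sig') as f:
        return json.load(f)
```

That way the reading code does not need to care how the file was saved.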
This one was actually resolved by the person who created the JSON file. He removed the UTF-8 BOM at the front of the file. However, the solutions above have been noted for the future. Thank you!