Posts: 9
Threads: 3
Joined: Apr 2021
Apr-04-2021, 08:17 PM
(This post was last modified: Apr-04-2021, 08:22 PM by johnboy1974.)
Hi,
I'm trying to convert a response which I think is in Unicode? I am using the requests library to get data from an API. I am running the following:
import requests
url = "https://api-football-v1.p.rapidapi.com/v3/standings"
querystring = {"season":"2020","league":"187"}
headers = {
'x-rapidapi-key': "asdfasdfasdfasdfasdf",
'x-rapidapi-host': "api-football-v1.p.rapidapi.com"
}
response = requests.request("GET", url, headers=headers, params=querystring)
strResponse = response.text
utf8string = strResponse.encode("utf-8")
print(utf8string)
It doesn't work though. In that response, I have the likes of:
{"id":10735,"name":" CR B\\u00e9ni Thour"}
But that should come back as:
{"id":10735,"name":" CR Béni Thour"}
Not sure what I'm doing wrong?
Many thanks!
J
Posts: 7,320
Threads: 123
Joined: Sep 2016
Apr-04-2021, 08:58 PM
(This post was last modified: Apr-04-2021, 08:59 PM by snippsat.)
An API like this usually gives back JSON, so you should not use response.text.
response = requests.request("GET", url, headers=headers, params=querystring)
json_data = response.json()
print(json_data)
What you should get back here is a Python dictionary, as Requests has built-in encoding and decoding for JSON.
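To see why parsing the body as JSON fixes the output while .encode("utf-8") does not, here is a minimal sketch that needs no network access. The raw_text string is a stand-in for response.text (an assumption based on the sample in the question), and json.loads plays roughly the role that response.json() plays on the real response.

```python
import json

# Stand-in for response.text: the body as sent by the server, where "é"
# is written as the six ASCII characters \u00e9 (a JSON escape sequence).
raw_text = '{"id":10735,"name":" CR B\\u00e9ni Thour"}'

# Encoding to UTF-8 only turns characters into bytes; the escape is untouched.
as_bytes = raw_text.encode("utf-8")
print(as_bytes)

# Parsing the text as JSON interprets the escape and yields a real "é".
team = json.loads(raw_text)
print(team["name"])
```

The escape is part of the JSON syntax rather than a character encoding, which is why only a JSON parser resolves it.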
Posts: 56
Threads: 2
Joined: Jan 2021
It is not stored as you want it; it is stored in a version of UTF-8 encoding that uses \u to specify the character. It is up to you to provide a way to translate the \u codes to the actual Unicode characters. To do this, you have to make sure that Unicode is supported in the internal character set of strings. The external rendering requires that the encoding be supported, or that you convert it yourself. So you have to replace the characters \u00e9 with the character whose code is Unicode 00e9. Now, you can make a special case, because \u00XX is equivalent to XX, so even if your strings are only 8-bit strings, you could replace the sequence \u00XX with the character whose code is XX, but that only works for the special case of \u00, and is not a general scheme.
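For what it's worth, in Python 3 the translation of \u codes does not have to be written by hand, since the standard json module already handles the general case. Below is a small sketch; decode_json_string is a hypothetical helper name, and it assumes the input is the inside of a valid JSON string literal (no unescaped quotes or control characters).

```python
import json

def decode_json_string(escaped: str) -> str:
    """Hypothetical helper: decode the contents of a JSON string literal.

    Assumes `escaped` holds no unescaped double quotes or control characters.
    """
    # Wrap the text in quotes so it is a complete JSON string, then parse it.
    return json.loads('"' + escaped + '"')

# The \u00XX case from the question.
beni = decode_json_string("B\\u00e9ni")

# Code points above U+00FF work the same way.
lodz = decode_json_string("\\u0141\\u00f3d\\u017a")

# Characters outside the Basic Multilingual Plane arrive as surrogate pairs,
# which the parser combines into a single character.
smiley = decode_json_string("\\ud83d\\ude00")
```

Python 3 strings are Unicode throughout, so the 8-bit special case is not needed here.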
Posts: 9
Threads: 3
Joined: Apr 2021
(Apr-04-2021, 08:58 PM)snippsat Wrote: An API like this usually gives back JSON, so you should not use response.text.
response = requests.request("GET", url, headers=headers, params=querystring)
json_data = response.json()
print(json_data)
What you should get back here is a Python dictionary, as Requests has built-in encoding and decoding for JSON.
That worked a treat, mate!! Thanks so much. I'm clearly very bad at this :)
Cheers,
John
Posts: 7,320
Threads: 123
Joined: Sep 2016
Apr-05-2021, 10:10 AM
(This post was last modified: Apr-05-2021, 10:10 AM by snippsat.)
(Apr-05-2021, 05:58 AM)supuflounder Wrote: It is not stored as you want it; it is stored in a version of UTF-8 encoding that uses \u to specify the character. It is up to you to provide a way to translate the \u codes to the actual Unicode characters. To do this, you have to make sure that Unicode is supported in the internal character set of strings. The external rendering requires that the encoding be supported, or that you convert it yourself. So you have to replace the characters \u00e9 with the character whose code is Unicode 00e9. Now, you can make a special case, because \u00XX is equivalent to XX, so even if your strings are only 8-bit strings, you could replace the sequence \u00XX with the character whose code is XX, but that only works for the special case of \u00, and is not a general scheme.
The problem here is that response.json() was not used.
response.text is the wrong approach here, and using it can lead to all kinds of Unicode encoding problems.
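As a side note on where the \u00e9 in the raw text comes from: many JSON encoders escape every non-ASCII character by default, and Python's own json.dumps does the same. The sketch below uses a hand-written dictionary as a stand-in for what response.json() would return (an assumption; the real payload is larger) and shows how to serialize it back to readable JSON.

```python
import json

# Stand-in for one entry of the parsed response.
data = {"id": 10735, "name": "CR Béni Thour"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes,
# which is the form seen in the raw response text.
ascii_json = json.dumps(data)
print(ascii_json)

# ensure_ascii=False keeps the characters as they are, which is handy when
# printing or saving the data for people to read.
readable_json = json.dumps(data, ensure_ascii=False)
print(readable_json)
```

Both strings describe the same data, and json.loads turns either one back into the same dictionary.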