Python Forum
Methods for Interrogating output from an API
#1
Hi,

I am trying to extract data values from a REST API response.

Firstly, could someone please clarify the following, and correct my terminology for records/elements etc. where needed? I have saved the API output to a text file so that the data stays consistent while I develop the client side of the process. Will the data read from the file be in the same "format" as the live API response? In other words, can I use the same code to process the data read from the text file as I would to process the output from the API?

The text file contains one string of data consisting of 1440 individual "records" (the last 24 hours' prices at one-minute intervals), each containing a number of "data names and values". Below is the first of the 1440 records, as an example.

Output:
{ "instrument" : "GBP_JPY", "granularity" : "M1", "candles" : [ { "time" : "2017-03-29T14:10:00.000000Z", "openBid" : 137.8, "openAsk" : 137.83, "highBid" : 137.903, "highAsk" : 137.933, "lowBid" : 137.8, "lowAsk" : 137.83, "closeBid" : 137.878, "closeAsk" : 137.906, "volume" : 499, "complete" : true },
Secondly, could someone outline the most suitable methods (a pseudocode-style summary would be sufficient; I'm thinking a tuple might work well here, but would DataFrames also work for this type of output?), with the relevant libraries (I am thinking of json and possibly pandas), to:

a) loop through the output and extract specific data values to calculate ranges of values, averages etc., based on: 1-minute values for the last 5 minutes, 5-minute values for the last 5 x 5-minute timeframes, 15-minute values for the last 5 x 15-minute timeframes, and so on.

b) do the same sort of processing on the data, but without looping through, by selecting specific records, e.g. records 1 through 5, or records 1, 5, 10, 15. I appreciate that this could be done via a loop with a step of, say, 5, but in reality I want to select much more specific records throughout the 24-hour timeframe, and I also plan to retrieve the data in other, smaller timeframes, so direct access to specific records is required.

In essence, I would like to treat each of the 1440 time frames of data as a record, and to use these records both by directly accessing specific ones (e.g. records 5, 15, 25) and by looping through sections of the complete string (e.g. records 1-5, 1-25, 1-50).

I have attached a zip of the complete text file that I am using (42k zipped, as the original exceeded the size limit).

Attachment: Candle.zip (Size: 41.84 KB)
Any help or pointers would be very much appreciated.

Thank you.

Bass
#2
I think that pandas could work for you. Something like
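import json
import zipfile

import pandas as pd

# Read the JSON text straight out of the attached zip and parse it.
with zipfile.ZipFile('Candle.zip') as fobj:
    data = json.loads(fobj.read('Candle.txt').decode('utf8'))

# One row per candle, indexed by its timestamp.
df = pd.DataFrame(data['candles'])
df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')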
Output:
In [8]: df.head(2)
Out[8]:
                     closeAsk  closeBid complete  highAsk  highBid  lowAsk  \
time
2017-03-29 14:10:00   137.906   137.878     True  137.933  137.903  137.83
2017-03-29 14:11:00   137.920   137.895     True  137.927  137.903  137.88

                      lowBid  openAsk  openBid  volume
time
2017-03-29 14:10:00  137.800  137.830  137.800     499
2017-03-29 14:11:00  137.854  137.906  137.879     289
That will create a DataFrame indexed by datetimes, and pandas date/time-based indexing is quite powerful.
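To make the indexing part a bit more concrete, here is a rough sketch of how the record selection and the 5-minute grouping from the question might look. It does not use the attached file: the 30 candles below are synthetic, with made-up prices and only a subset of the fields, and names such as raw_text, first_five, picked, window and five_min are just illustrative. It also shows that json.loads works on the text itself, so the same parsing code should apply whether the string was read from a saved file or taken from the API response body.

```python
import json

import pandas as pd

# Synthetic stand-in for the API text: 30 one-minute candles with made-up
# prices (field names follow the sample record; the values are NOT real data).
times = pd.date_range("2017-03-29 14:10", periods=30, freq="min")
candles = [
    {
        "time": t.strftime("%Y-%m-%dT%H:%M:%S.000000Z"),
        "openBid": 137.0 + i,
        "highBid": 137.5 + i,
        "lowBid": 136.5 + i,
        "closeBid": 137.2 + i,
        "volume": 100 + i,
        "complete": True,
    }
    for i, t in enumerate(times)
]
raw_text = json.dumps(
    {"instrument": "GBP_JPY", "granularity": "M1", "candles": candles}
)

# json.loads only sees a string, so it does not matter whether raw_text
# came from a saved file or from an HTTP response body.
data = json.loads(raw_text)
df = pd.DataFrame(data["candles"])
df["time"] = pd.to_datetime(df["time"], utc=True)
df = df.set_index("time")

# Position-based access: records 1-5, then records 1, 5, 10, 15 (1-based).
first_five = df.iloc[:5]
picked = df.iloc[[0, 4, 9, 14]]

# Time-based access: a label slice includes both endpoints.
start = pd.Timestamp("2017-03-29 14:10", tz="UTC")
end = pd.Timestamp("2017-03-29 14:14", tz="UTC")
window = df.loc[start:end]

# Rebuild 5-minute candles from the 1-minute ones.
five_min = df.resample("5min").agg(
    {
        "openBid": "first",
        "highBid": "max",
        "lowBid": "min",
        "closeBid": "last",
        "volume": "sum",
    }
)

# Average close over the most recent five 1-minute records.
last5_mean_close = df["closeBid"].iloc[-5:].mean()
```

The same resample call with "15min" would give the 15-minute timeframes, and five_min.iloc[-5:] would pick out the last five of them, so no explicit loop should be needed for the averages.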