Python Forum

mapping issue
Hi, I am trying to convert a date-time string into a time serial between 0 and 1 that represents the time of day as a fraction of 24 hours. I could not find an existing function for this, so I tried to write one and map it over the column.

TSLA_1sec is a DataFrame whose column 'Local TimeBid' is a series of strings:
11.02.2020 01:30:07.000 GMT+1100
11.02.2020 01:30:08.000 GMT+1100
11.02.2020 01:30:10.000 GMT+1100
11.02.2020 01:30:11.000 GMT+1100
11.02.2020 01:30:12.000 GMT+1100
11.02.2020 01:30:14.000 GMT+1100

My coding:
#convert timedate column into a 0-1 designation of time.
def TimeSerialFunc(DateStringStamp):
    DateStringStamp = DateStringStamp[11:-13]
    tHr = int(DateStringStamp[:2])/24
    tMin = int(DateStringStamp[3:5])/24/60
    tSec = int(DateStringStamp[6:])/24/60/60
    return  tHr + tMin + tSec

TSLA_1sec['Local TimeBID'].astype('str')
print(TSLA_1sec ['Local TimeBID'] )
TSLA_1sec ['Local TimeBID'] = map(TimeSerialFunc,TSLA_1sec['Local TimeBid'])
print( TSLA_1sec['Local TimeBid'])
My traceback error:
Exception has occurred: KeyError
'Local TimeBid'
File "/Users/jasonrae/Documents/Python Files/pandas/_libs/hashtable_class_helper.pxi", line 1626, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "/Users/jasonrae/Documents/Python Files/TSLA Try1.py", line 55, in <module>
TSLA_1sec ['Local TimeBID'] = map(TimeSerialFunc,TSLA_1sec['Local TimeBid'])

I tried converting the 'Local TimeBID' series from dtype object to string, thinking that was the problem, but with no success.
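The KeyError in the traceback looks like a label mismatch: the failing line reads 'Local TimeBid', while the lines before it use 'Local TimeBID', and pandas column labels are case-sensitive. Note also that astype returns a new Series, so calling it without assigning the result changes nothing. Below is a minimal sketch of the same mapping with one consistent label. The two-row frame and the names time_serial and TimeSerial are stand-ins for illustration, not the real data.

```python
import pandas as pd


def time_serial(stamp: str) -> float:
    """Return the time of day as a fraction of 24 hours (0.0 to 1.0)."""
    clock = stamp[11:19]  # the "HH:MM:SS" part of the stamp
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return (hours * 3600 + minutes * 60 + seconds) / 86400


# Hypothetical stand-in for TSLA_1sec, using the label spelled with "BID".
frame = pd.DataFrame({
    "Local TimeBID": [
        "11.02.2020 01:30:07.000 GMT+1100",
        "11.02.2020 01:30:08.000 GMT+1100",
    ]
})

# Print the labels to check their exact spelling and case before indexing.
print(list(frame.columns))

# Series.map calls the function once per element and returns a new Series.
frame["TimeSerial"] = frame["Local TimeBID"].map(time_serial)
print(frame["TimeSerial"])
```

Using Series.map (rather than the built-in map) keeps the result aligned with the frame's index.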
Dates in a dataframe can be parsed with pd.to_datetime, so you don't need
to parse them yourself.

import pandas as pd
In [4]: df = pd.DataFrame({'a b': ["11.02.2020 01:30:07.000 GMT+1100",
   ...: "11.02.2020 01:30:08.000 GMT+1100",
   ...: "11.02.2020 01:30:10.000 GMT+1100"]})
In [5]: pd.to_datetime(df['a b']).apply(lambda x: x.hour/24 + x.minute/1440 + x.second/86400)
Output:
Out[5]:
0    0.062581
1    0.062593
2    0.062616
Name: a b, dtype: float64
Everything works fine; however, I wouldn't recommend using column names with spaces. One reason is that a column can be accessed as an attribute of the dataframe, e.g. df.column_name instead of df['column_name'], but only if the column name contains no spaces. In the example above I couldn't access the column as df.a b because of the space.
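The same conversion can also be sketched without apply, using the .dt accessor on the parsed column. The explicit format string below is an assumption about the data (day-first dates and a literal "GMT" before the offset, as the sample rows suggest), and the names stamps, parsed and day_fraction are made up for this example.

```python
import pandas as pd

stamps = pd.Series(
    [
        "11.02.2020 01:30:07.000 GMT+1100",
        "11.02.2020 01:30:08.000 GMT+1100",
        "11.02.2020 01:30:10.000 GMT+1100",
    ],
    name="Local_TimeBID",
)

# Assumed layout: day.month.year, fractional seconds, then "GMT" and a
# +HHMM offset. Adjust the format if the real file differs.
parsed = pd.to_datetime(stamps, format="%d.%m.%Y %H:%M:%S.%f GMT%z")

# Seconds since local midnight divided by the seconds in a day (86400).
seconds_of_day = parsed.dt.hour * 3600 + parsed.dt.minute * 60 + parsed.dt.second
day_fraction = seconds_of_day / 86400
print(day_fraction)
```

Passing an explicit format avoids relying on pandas guessing the layout, and the .dt accessor works on the whole column at once.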
Thank you, I have updated my labels. I have since tried a bunch of different combinations. I can get your code to work if I create the dataframe manually as you did, but applying it to the dataframe as I have built it raises this error:
Error:
AttributeError: 'Series' object has no attribute 'second'
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

from tensorflow import keras
from keras import layers
from keras import regularizers

import numpy as np
import pandas as pd

TSLA_BIDS_loc = "/Users/jasonrae/Downloads/TSLA.USUSD_Candlestick_1_s_BID_11.02.2020-11.02.2020.csv"
TSLA_ASKS_loc = "/Users/jasonrae/Downloads/TSLA.USUSD_Candlestick_1_s_ASK_11.02.2020-11.02.2020.csv"

column_names_BIDS = ['Local_TimeBID', 'OpenBID', 'HighBID', 'LowBID', 'CloseBID', 'VolumeBID']
column_names_ASKS = ['Local_TimeASK', 'OpenASK', 'HighASK', 'LowASK', 'CloseASK', 'VolumeASK']
#raw_dataset = pd.read_csv(dataset_path, names=column_names)

dataread_bids = pd.read_csv(TSLA_BIDS_loc, names=column_names_BIDS)
dataread_asks = pd.read_csv(TSLA_ASKS_loc, names=column_names_ASKS)
print("Asks dataread: \n", dataread_asks)

dataread_bids = dataread_bids.drop([0], axis=0) # row 0
df = pd.DataFrame(dataread_bids['Local_TimeBID'])
print(df.shape)
df = pd.DataFrame(df['Local_TimeBID']).apply(lambda x: ((x.second/60 + x.minute)/60 + x.hour)/24)

print(df)
print(TSLA_1sec)
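dataread_bids is already a dataframe, so I don't understand why you pass its column to the pd.DataFrame constructor, i.e. df = pd.DataFrame(dataread_bids['Local_TimeBID']). Calling apply on a dataframe passes each whole column (a Series) to the lambda, which is why you get the AttributeError; the pd.to_datetime call is also missing. You probably want something like this: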
dataread_bids['Local_TimeBID_converted'] = pd.to_datetime(dataread_bids['Local_TimeBID']).apply(lambda x: ((x.second/60 + x.minute)/60 + x.hour)/24)
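The difference between the two apply calls can be reproduced on a small frame. This is only an illustrative sketch: the two-row frame and the helper name day_fraction are invented for the example.

```python
import pandas as pd

# Hypothetical two-row frame with an already parsed datetime column.
frame = pd.DataFrame({
    "Local_TimeBID": pd.to_datetime(["2020-02-11 01:30:07", "2020-02-11 01:30:08"])
})


def day_fraction(ts):
    """Time of day as a fraction of 24 hours, for a single Timestamp."""
    return ((ts.second / 60 + ts.minute) / 60 + ts.hour) / 24


# Series.apply hands the function one Timestamp at a time, so this works.
per_element = frame["Local_TimeBID"].apply(day_fraction)
print(per_element)

# DataFrame.apply hands the function a whole column (a Series) at a time,
# and a Series has no .second attribute, hence the AttributeError.
message = ""
try:
    frame.apply(day_fraction)
except AttributeError as exc:
    message = str(exc)
print(message)
```

Selecting the column with frame["Local_TimeBID"] gives a Series, so no pd.DataFrame wrapper is needed.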
Thank you so much! Mostly because I'm new and just simply did not notice! I was reading what I expected to see, not what was actually written!