Oct-30-2023, 02:25 PM
English is not my native language, so please excuse any typing errors.
Also, this is my first post. I hope my questions will get more precise soon.
I am manipulating Excel sheets using pandas, and I noticed that my code changes the data types of the data frames by itself. I specified the dtypes when reading the Excel sheet with:
import pandas as pd

form = {"DayofMonth": float, "Tail_Number": str, "CRSDepTime": float, "DepDelay": float, "ArrTime": float}
arr = pd.read_excel(ad_arr, dtype=form, engine='openpyxl', usecols=['DayofMonth', 'Tail_Number', 'CRSDepTime', 'DepDelay', 'ArrTime'])

So when I use print(arr.dtypes) the result is
Output:
DayofMonth float64
Tail_Number object
CRSDepTime float64
DepDelay float64
ArrTime float64
ID object
dtype: object
However, the code uses a command similar to this one:

line_arr = arr.loc[0].to_frame().T

Surprisingly, when I use print(line_arr.dtypes) the result is
Output:
DayofMonth object
Tail_Number object
CRSDepTime object
DepDelay object
ArrTime object
ID object
dtype: object
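The behaviour can be reproduced without the Excel file. The sketch below uses a small made-up frame (the values and the reduced column set are assumptions, not my real data). It suggests the change already happens at arr.loc[0]: selecting one row by a scalar label returns a Series, and a Series holds a single dtype, so a row mixing floats and strings is stored as object.

```python
import pandas as pd

# Hypothetical stand-in for the frame read from Excel (made-up values).
arr = pd.DataFrame({
    "DayofMonth": [1.0, 2.0],
    "Tail_Number": ["N123AA", "N456BB"],
    "DepDelay": [5.0, -3.0],
})

# A scalar label returns one row as a Series; mixed floats and strings
# can only share the generic object dtype.
row = arr.loc[0]
print(row.dtype)  # object

# Turning the Series back into a one-row frame keeps object everywhere.
line_arr = row.to_frame().T
print(line_arr.dtypes)
```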
This causes trouble later when I compare data frames to check whether they are identical. One solution that I tried was forcing the data types back to what they should be, with:

dtypes = tot_tail.dtypes
line_arr = line_arr.astype(dtypes)

Is there any defensive programming strategy I could use to avoid this problem? I don't want to force data types back over and over again.
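One approach that seems to avoid the round trip through a Series is to select the row with a list of labels, so the result stays a DataFrame and each column keeps its own dtype. This is only a minimal sketch on made-up data (the frame and its values are assumptions), not a tested fix for the real Excel file.

```python
import pandas as pd

# Hypothetical stand-in for the frame read from Excel (made-up values).
arr = pd.DataFrame({
    "DayofMonth": [1.0, 2.0],
    "Tail_Number": ["N123AA", "N456BB"],
    "DepDelay": [5.0, -3.0],
})

# A list of labels (note the double brackets) returns a one-row DataFrame
# instead of a Series, so no common dtype has to be found across columns.
line_arr = arr.loc[[0]]
print(line_arr.dtypes)

# The positional equivalent behaves the same way.
line_arr_pos = arr.iloc[[0]]

# Both one-row frames keep the column dtypes of the original frame.
print((line_arr.dtypes == arr.dtypes).all())
```

With this form, to_frame().T and the later astype call are not needed, because the row never leaves the DataFrame representation.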