Python Forum

Aim: I would like to read data with read_csv and convert it to a DataFrame.
What I've tried: 1. data = pd.read_csv(...) 2. pd.DataFrame(data)
Problem: The data is not shown in the DataFrame in 2 columns as expected.

import pandas as pd
import numpy as np

from datetime import datetime, timedelta

# read data, date_parser=[0]: first column to datetime, 
data = pd.read_csv('minimal_data.csv', delimiter = ';', date_parser=[0], usecols=[0, 1], header = 0, names = ["MyColumn1","MyColumn2"]), 
   
print(data)    
    
df = pd.DataFrame(data)

print(df)

---
MacOS 10.15.7, Jupyter notebook

Hello,

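I'm not an expert with pandas, however:

read_csv already returns a DataFrame, so data does not need to be passed to pd.DataFrame().
Drop lines 10 - 13 of your code (the pd.DataFrame(data) conversion and the second print) and all will be fine.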

import pandas as pd
import numpy as np
 
from datetime import datetime, timedelta
 
# read data, date_parser=[0]: first column to datetime, 
data = pd.read_csv('minimal_data.csv', delimiter = ';', date_parser=[0], usecols=[0, 1], header = 0, names = ["MyColumn1","MyColumn2"]), 
    
print(data)

Output:
(             MyColumn1  MyColumn2
0  09.06.2021 14:35:05        100
1  09.06.2021 14:36:16        100
2  09.06.2021 14:37:26        100
3  09.06.2021 14:38:37        100
4  09.06.2021 14:39:48        100
5  09.06.2021 14:40:59        100
6  09.06.2021 14:42:10        100
7  09.06.2021 14:43:21        100
8  09.06.2021 14:44:32        100,)

Thank you. I actually need to further evaluate the data. And if I try it directly with data, I get:

data.dtypes

AttributeError: 'tuple' object has no attribute 'dtypes'

data.loc[data['MyColumn2'] == 0]

AttributeError: 'tuple' object has no attribute 'loc'
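
For reference, a trailing comma after the read_csv(...) call is enough to produce this kind of error. The sketch below reproduces it with an in-memory CSV; the sample rows and the csv_text name are made up for illustration and only stand in for minimal_data.csv.

```python
import io
import pandas as pd

# Hypothetical sample standing in for minimal_data.csv
csv_text = "Date;Value\n09.06.2021 14:35:05;100\n09.06.2021 14:36:16;100\n"

# Trailing comma: Python wraps the DataFrame in a one-element tuple
data = pd.read_csv(io.StringIO(csv_text), delimiter=";", header=0, names=["MyColumn1", "MyColumn2"]),
print(type(data))  # <class 'tuple'>

# Without the comma, read_csv returns the DataFrame itself
df = pd.read_csv(io.StringIO(csv_text), delimiter=";", header=0, names=["MyColumn1", "MyColumn2"])
print(type(df))
print(df.dtypes)
```

With the comma removed, attributes such as dtypes and loc are available directly on the result.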

(Jun-15-2021, 01:26 PM) ju21878436312 wrote: Thank you. I actually need to further evaluate the data. And if I try it directly with data, I get:

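You need to get the DataFrame out of the tuple. The trailing comma after the read_csv(...) call is what turns data into a one-element tuple.
Here is an example with some advice.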

import pandas as pd
import numpy as np

# pandas has its own datetime handling, so this import is not needed
#from datetime import datetime, timedelta

# Read data. Note: date_parser=[0] alone does not convert the first column to datetime
# (parse_dates=[0] is the usual way), and the trailing comma makes data a one-element tuple.
data = pd.read_csv('minimal_data.csv', delimiter = ';', date_parser=[0], usecols=[0, 1], header=0, names=["MyColumn1","MyColumn2"]),

# Get the DataFrame out of the tuple
df = data[0]
# Convert to datetime64
df['MyColumn1'] = pd.to_datetime(df['MyColumn1'])
print(df.dtypes)
print(df)
print('-' * 30)
print(df.loc[df['MyColumn2'] == 0])

Output:
MyColumn1    datetime64[ns]
MyColumn2             int64
dtype: object
            MyColumn1  MyColumn2
0 2021-09-06 14:35:05        178
1 2021-09-06 14:36:16         59
2 2021-09-06 14:37:26          0
3 2021-09-06 14:38:37          0
4 2021-09-06 14:39:48          0
5 2021-09-06 14:40:59          0
6 2021-09-06 14:42:10          0
7 2021-09-06 14:43:21          0
8 2021-09-06 14:44:32          0
------------------------------
            MyColumn1  MyColumn2
2 2021-09-06 14:37:26          0
3 2021-09-06 14:38:37          0
4 2021-09-06 14:39:48          0
5 2021-09-06 14:40:59          0
6 2021-09-06 14:42:10          0
7 2021-09-06 14:43:21          0
8 2021-09-06 14:44:32          0
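
A possible variant that avoids the data[0] step altogether is to drop the trailing comma. Note also that the output above shows 2021-09-06, so "09.06.2021" was read month-first. If the file actually uses day.month.year (an assumption about the data, not something the thread confirms), passing an explicit format to pd.to_datetime avoids the swap. The rows in csv_text below are made up to mirror the output above.

```python
import io
import pandas as pd

# Hypothetical rows in the same layout as the output above
csv_text = (
    "Date;Value\n"
    "09.06.2021 14:35:05;178\n"
    "09.06.2021 14:36:16;59\n"
    "09.06.2021 14:37:26;0\n"
)

# No trailing comma, so df is a DataFrame and no data[0] step is needed
df = pd.read_csv(
    io.StringIO(csv_text),
    delimiter=";",
    usecols=[0, 1],
    header=0,
    names=["MyColumn1", "MyColumn2"],
)

# Assumption: timestamps are day.month.year, so state the format explicitly
df["MyColumn1"] = pd.to_datetime(df["MyColumn1"], format="%d.%m.%Y %H:%M:%S")

print(df.dtypes)
print(df.loc[df["MyColumn2"] == 0])
```

With the explicit format, the first timestamp is parsed as 9 June 2021 rather than 6 September 2021.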

@snippsat: Thank you very much for the useful comments!

ju21878436312

Larz60+

ju21878436312

snippsat

ju21878436312