How to store columns of a .csv in variables (when the .csv has no headers) ?

hobbyist · (This post was last modified: Aug-17-2023, 08:52 PM by hobbyist.)

Hello! I have a .csv file and by using the following code:

my_dataset = pandas_package.read_csv("MyMeasurements.csv")
my_dataset["date"] = pandas_package.to_datetime(my_dataset["DATE"])

TimeStamp = my_dataset["date"]
sensor1_measurements = my_dataset["sensor1"]
sensor2_measurements = my_dataset["sensor2"]

And I get correctly each column of the .csv assigned to each variable. The problem is the following:

How can I get the same result if the .csv does not have headers ('TimeStamp ', 'sensor1_measurements ', 'sensor2_measurements ')?

I tried google searching and testing code but nothing worked... Any ideas???

**deanhystad** · (This post was last modified: Aug-18-2023, 12:56 AM by deanhystad.)

Start by reading the documentation for Pandas.read_csv(). I find it very useful.

https://pandas.pydata.org/docs/reference...d_csv.html

Where you find this:

Quote:names: array-like, optional
List of column names to use. If the file contains a header row, then you should explicitly pass header=0 to override the column names. Duplicates in this list are not allowed.

This assumes that you know the column names and the order of columns in the CSV.

hobbyist · Aug-18-2023, 08:25 AM

So, I cannot achieve what I want, right?

**deanhystad** · (This post was last modified: Aug-18-2023, 08:56 AM by deanhystad.)

That depends on what you want. You can read a CSV file without a header, but your column index will be numbers, 0, 1, 2... just as you get for the row index. To read a CSV file that has no header, again we return to the excellent Pandas documentation.

https://pandas.pydata.org/docs/reference...d_csv.html

This time it is not quite as excellent as before.

Quote:header: int, list of int, None, default ‘infer’
Row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed the behavior is identical to header=0 and column names are inferred from the first line of the file, if column names are passed explicitly then the behavior is identical to header=None. Explicitly pass header=0 to be able to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns e.g. [0,1,3]. Intervening rows that are not specified will be skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.

Our choices are:
int : Use this line in the file as the column header.
list of int: Use these lines in the file as the column header (for multi-index headers).
None: Not described very well what happens, but sounds like it will use "names" provided. What does it do if we don't provide names?
default: Infer column header names from the file.

So if you have a CSV file with no header, and you want Pandas to generate column numbers instead of inferring names from the file, set "header = None".

df = pd.read_csv(filename, header=None)

hobbyist · Aug-18-2023, 09:41 AM

Ok, I have read it! Thank you... I am experimenting now... However, what do I put here:

TimeStamp = ???
sensor1_measurements = ???
sensor2_measurements = ???

Thanks once again for your time...

***snippsat*** · (This post was last modified: Aug-18-2023, 01:50 PM by snippsat.)

(Aug-17-2023, 08:52 PM)hobbyist Wrote: And I get correctly each column of the .csv assigned to each variable. The problem is the following:

I think there are some confusion here,you shall not assigned to each variable in Pandas,when read from .csv it makes one DataFrame.
So your line 4-6 is not needed.
If i guess how your MyMeasurements.csv looks.

Output:DATE,sensor1,sensor2
2023-08-01 00:00:00,25.6,60.2
2023-08-01 01:00:00,25.7,61.5
2023-08-01 02:00:00,25.9,62.8
2023-08-01 03:00:00,26.2,63.2
2023-08-01 04:00:00,26.5,63.8

import pandas as pd

df = pd.read_csv('data.csv')
df["DATE"] = pd.to_datetime(df["DATE"])

This in now a working DataFrame,no need to assigned variables.
Line 4 is not assign a variable,it convert DATE to pandas datetime64 type.
Look at and do something with DataFrame.

>>> df
                 DATE  sensor1  sensor2
0 2023-08-01 00:00:00     25.6     60.2
1 2023-08-01 01:00:00     25.7     61.5
2 2023-08-01 02:00:00     25.9     62.8
3 2023-08-01 03:00:00     26.2     63.2
4 2023-08-01 04:00:00     26.5     63.8

# Look Types they are now correct
>>> df.dtypes
DATE       datetime64[ns]
sensor1           float64
sensor2           float64
dtype: object

# Or can use df.info()
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype         
---  ------   --------------  -----         
 0   DATE     5 non-null      datetime64[ns]
 1   sensor1  5 non-null      float64       
 2   sensor2  5 non-null      float64       
dtypes: datetime64[ns](1), float64(2)
memory usage: 248.0 bytes

# Max value of both sensors 
>>> max_values = df[['sensor1', 'sensor2']].max()
>>> max_values
sensor1    26.5
sensor2    63.8
dtype: float64

# To csv
>>> df.to_csv(index=False, line_terminator='\n')
('DATE,sensor1,sensor2\n'
 '2023-08-01 00:00:00,25.6,60.2\n'
 '2023-08-01 01:00:00,25.7,61.5\n'
 '2023-08-01 02:00:00,25.9,62.8\n'
 '2023-08-01 03:00:00,26.2,63.2\n'
 '2023-08-01 04:00:00,26.5,63.8\n')

**deanhystad** · Aug-18-2023, 02:06 PM

(Aug-18-2023, 09:41 AM)hobbyist Wrote: Ok, I have read it! Thank you... I am experimenting now... However, what do I put here:
TimeStamp = ???
sensor1_measurements = ???
sensor2_measurements = ???
Thanks once again for your time...

As I said in my first reply, if you know what columns are in the file you give them names:

df = pd.read_csv(filename, names=["date", "sensor1", "sensor2"])

If you don't know what the columns are the file is worthless, isn't it?

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	TreeView column headers	TWB	2	9,230	Jan-29-2023, 02:13 PM Last Post: TWB
	export sql table to csv using BCP with headers	mg24	0	1,300	Jan-19-2023, 05:36 AM Last Post: mg24
	Code changing rder of headers	Led_Zeppelin	0	1,546	Jul-13-2022, 05:38 PM Last Post: Led_Zeppelin
	making variables in my columns and rows in python	kronhamilton	2	2,556	Oct-31-2021, 10:38 AM Last Post: snippsat
	Request Headers (scheme)	JohnnyCoffee	0	2,521	Mar-31-2021, 09:17 PM Last Post: JohnnyCoffee
	Reading csv with multiple "headers"	Clives	3	3,855	Dec-31-2020, 09:25 AM Last Post: Ana_junior
	module to store functions/variables and how to call them?	mstichler	3	3,661	Jun-03-2020, 06:49 PM Last Post: mstichler
	I can't use file __init__ to store shared variables and classes in the package	AlekseyPython	2	4,297	Feb-04-2019, 06:26 AM Last Post: AlekseyPython
	Receive Serial Data and store in different Variables in Python	jenkins43	5	7,603	Dec-28-2018, 01:33 PM Last Post: snippsat

How to store columns of a .csv in variables (when the .csv has no headers) ?

User Panel Messages

Announcements