Python Forum
How to store columns of a .csv in variables (when the .csv has no headers) ?
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to store columns of a .csv in variables (when the .csv has no headers) ?
#1
Hello! I have a .csv file and by using the following code:

my_dataset = pandas_package.read_csv("MyMeasurements.csv")
my_dataset["date"] = pandas_package.to_datetime(my_dataset["DATE"])

TimeStamp = my_dataset["date"]
sensor1_measurements = my_dataset["sensor1"]
sensor2_measurements = my_dataset["sensor2"]
And I get correctly each column of the .csv assigned to each variable. The problem is the following:

How can I get the same result if the .csv does not have headers ('TimeStamp ', 'sensor1_measurements ', 'sensor2_measurements ')?

I tried google searching and testing code but nothing worked... Any ideas???
Reply
#2
Start by reading the documentation for Pandas.read_csv(). I find it very useful.

https://pandas.pydata.org/docs/reference...d_csv.html

Where you find this:
Quote:names: array-like, optional
List of column names to use. If the file contains a header row, then you should explicitly pass header=0 to override the column names. Duplicates in this list are not allowed.

This assumes that you know the column names and the order of columns in the CSV.
hobbyist likes this post
Reply
#3
So, I cannot achieve what I want, right?
Reply
#4
That depends on what you want. You can read a CSV file without a header, but your column index will be numbers, 0, 1, 2... just as you get for the row index. To read a CSV file that has no header, again we return to the excellent Pandas documentation.

https://pandas.pydata.org/docs/reference...d_csv.html

This time it is not quite as excellent as before.

Quote:header: int, list of int, None, default ‘infer’
Row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed the behavior is identical to header=0 and column names are inferred from the first line of the file, if column names are passed explicitly then the behavior is identical to header=None. Explicitly pass header=0 to be able to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns e.g. [0,1,3]. Intervening rows that are not specified will be skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.

Our choices are:
int : Use this line in the file as the column header.
list of int: Use these lines in the file as the column header (for multi-index headers).
None: Not described very well what happens, but sounds like it will use "names" provided. What does it do if we don't provide names?
default: Infer column header names from the file.

So if you have a CSV file with no header, and you want Pandas to generate column numbers instead of inferring names from the file, set "header = None".
df = pd.read_csv(filename, header=None)
hobbyist likes this post
Reply
#5
Ok, I have read it! Thank you... I am experimenting now... However, what do I put here:

TimeStamp = ???
sensor1_measurements = ???
sensor2_measurements = ???
Thanks once again for your time...
Reply
#6
(Aug-17-2023, 08:52 PM)hobbyist Wrote: And I get correctly each column of the .csv assigned to each variable. The problem is the following:
I think there are some confusion here,you shall not assigned to each variable in Pandas,when read from .csv it makes one DataFrame.
So your line 4-6 is not needed.
If i guess how your MyMeasurements.csv looks.
Output:
DATE,sensor1,sensor2 2023-08-01 00:00:00,25.6,60.2 2023-08-01 01:00:00,25.7,61.5 2023-08-01 02:00:00,25.9,62.8 2023-08-01 03:00:00,26.2,63.2 2023-08-01 04:00:00,26.5,63.8
import pandas as pd

df = pd.read_csv('data.csv')
df["DATE"] = pd.to_datetime(df["DATE"])
This in now a working DataFrame,no need to assigned variables.
Line 4 is not assign a variable,it convert DATE to pandas datetime64 type.
Look at and do something with DataFrame.
>>> df
                 DATE  sensor1  sensor2
0 2023-08-01 00:00:00     25.6     60.2
1 2023-08-01 01:00:00     25.7     61.5
2 2023-08-01 02:00:00     25.9     62.8
3 2023-08-01 03:00:00     26.2     63.2
4 2023-08-01 04:00:00     26.5     63.8

# Look Types they are now correct
>>> df.dtypes
DATE       datetime64[ns]
sensor1           float64
sensor2           float64
dtype: object

# Or can use df.info()
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype         
---  ------   --------------  -----         
 0   DATE     5 non-null      datetime64[ns]
 1   sensor1  5 non-null      float64       
 2   sensor2  5 non-null      float64       
dtypes: datetime64[ns](1), float64(2)
memory usage: 248.0 bytes

# Max value of both sensors 
>>> max_values = df[['sensor1', 'sensor2']].max()
>>> max_values
sensor1    26.5
sensor2    63.8
dtype: float64

# To csv
>>> df.to_csv(index=False, line_terminator='\n')
('DATE,sensor1,sensor2\n'
 '2023-08-01 00:00:00,25.6,60.2\n'
 '2023-08-01 01:00:00,25.7,61.5\n'
 '2023-08-01 02:00:00,25.9,62.8\n'
 '2023-08-01 03:00:00,26.2,63.2\n'
 '2023-08-01 04:00:00,26.5,63.8\n')
hobbyist likes this post
Reply
#7
(Aug-18-2023, 09:41 AM)hobbyist Wrote: Ok, I have read it! Thank you... I am experimenting now... However, what do I put here:

TimeStamp = ???
sensor1_measurements = ???
sensor2_measurements = ???
Thanks once again for your time...
As I said in my first reply, if you know what columns are in the file you give them names:
df = pd.read_csv(filename, names=["date", "sensor1", "sensor2"])
If you don't know what the columns are the file is worthless, isn't it?
hobbyist likes this post
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  TreeView column headers TWB 2 2,218 Jan-29-2023, 02:13 PM
Last Post: TWB
  export sql table to csv using BCP with headers mg24 0 718 Jan-19-2023, 05:36 AM
Last Post: mg24
  Code changing rder of headers Led_Zeppelin 0 915 Jul-13-2022, 05:38 PM
Last Post: Led_Zeppelin
  making variables in my columns and rows in python kronhamilton 2 1,630 Oct-31-2021, 10:38 AM
Last Post: snippsat
  Request Headers (scheme) JohnnyCoffee 0 1,922 Mar-31-2021, 09:17 PM
Last Post: JohnnyCoffee
  Reading csv with multiple "headers" Clives 3 2,535 Dec-31-2020, 09:25 AM
Last Post: Ana_junior
  module to store functions/variables and how to call them? mstichler 3 2,419 Jun-03-2020, 06:49 PM
Last Post: mstichler
  I can't use file __init__ to store shared variables and classes in the package AlekseyPython 2 3,355 Feb-04-2019, 06:26 AM
Last Post: AlekseyPython
  Receive Serial Data and store in different Variables in Python jenkins43 5 5,629 Dec-28-2018, 01:33 PM
Last Post: snippsat

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020