Python Forum
pandas.read_csv Issues Reading Certain Columns
#1
Hey guys. I'm trying to get pandas.read_csv to read certain columns in my dataset. It says that the 'high' column is not in the list...

I'm trying to get it to read only the 'open', 'high', 'low', 'close' columns.

Here's the code that I am having trouble with:

Fields = ['open', 'high', 'low', 'close']

# load dataset
dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=Fields)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:3]
Y = dataset[:,1]
Here is the error message:

/usr/bin/python2.7 /home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 40167 --file /home/b/PycharmProjects/ANN1a/ANN2-Keras1a
pydev debugger: process 662 is connecting

Connected to pydev debugger (build 172.3968.37)
Using TensorFlow backend.
Traceback (most recent call last):
  File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1599, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/b/PycharmProjects/ANN1a/ANN2-Keras1a", line 14, in <module>
    dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=Fields)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 498, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 275, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 590, in __init__
    self._make_engine(self.engine)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 731, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1138, in __init__
    col_indices.append(self.names.index(u))
ValueError: 'high' is not in list
Backend TkAgg is interactive backend. Turning interactive mode on.
Here is what the top of dataset looks like:

Date 		Open 	High 	Low     Close   Volume

16:00 		0.7477 	0.78 	0.746 	0.756 	805,154
10/25/2017 	0.742 	0.78 	0.735 	0.7585 	1,154,260
10/24/2017 	0.776 	0.792 	0.73 	0.745 	1,718,896
10/23/2017 	0.8067 	0.8067 	0.78 	0.78 	933,888
10/20/2017 	0.81 	0.8278 	0.77 	0.82 	1,339,279
10/19/2017 	0.8448 	0.864 	0.7863 	0.8178 	2,236,871
10/18/2017 	0.7701 	0.82 	0.7501 	0.81 	2,681,291
10/17/2017 	0.75 	0.759 	0.725 	0.7465 	929,243
10/16/2017 	0.7399 	0.7651 	0.72 	0.75 	1,539,805
10/13/2017 	0.75 	0.75 	0.69 	0.74 	2,958,883
Anyone see what I'm doing wrong? I'm stumped...
#2
Is it case sensitive?
If so, shouldn't it be 'High'?
#3
Connected to pydev debugger (build 172.3968.37)
/usr/bin/python2.7 /home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 39687 --file /home/b/PycharmProjects/ANN1a/ANN2-Keras1a
pydev debugger: process 1956 is connecting

Using TensorFlow backend.
Traceback (most recent call last):
  File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1599, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/b/PycharmProjects/ANN1a/ANN2-Keras1a", line 14, in <module>
    dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=Fields)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 498, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 275, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 590, in __init__
    self._make_engine(self.engine)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 731, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1138, in __init__
    col_indices.append(self.names.index(u))
ValueError: 'High' is not in list
Backend TkAgg is interactive backend. Turning interactive mode on.
I get the same error message. Is there another way to direct pandas to the columns that I want?
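
One way to see why both 'high' and 'High' fail is to look at the labels the parser actually produces. The sketch below is an illustration, not the poster's script: it reads a few rows copied from the post out of an in-memory string (SAMPLE is a stand-in for PTNprice.csv) and uses sep=r"\s+" as the whitespace separator. With header=None the first line is treated as data and the columns are labelled with integers, so no text label can match.

```python
import io

import pandas as pd

# A few rows copied from the post; SAMPLE stands in for the real file.
SAMPLE = """Date Open High Low Close Volume
10/25/2017 0.742 0.78 0.735 0.7585 1,154,260
10/24/2017 0.776 0.792 0.73 0.745 1,718,896
"""

# header=None: the header line is read as an ordinary data row,
# and the columns get integer labels instead of names.
no_header = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", header=None)

# Default header handling: the first line supplies the column names.
with_header = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

print(list(no_header.columns))
print(list(with_header.columns))
```

Comparing the two printed lists shows whether a name such as 'High' is available to usecols at all.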
#4
Take a look at: https://pandas.pydata.org/pandas-docs/st...exing.html
#5
Thanks for the response Larz. It took me a little while reading over all of that, but I figured it out. This is what I ended up with.

# load dataset
dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=[1,2,3,4])
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:3]
I guess the usecols argument only uses integers such as '1,2,3'. I guess that correlates to the actual location of each column in your 'dataset.csv'.
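
For anyone landing here later: usecols also accepts column names (or a callable), as long as the header row is actually read as a header. A minimal sketch, assuming the file's first line holds the names shown in the post. SAMPLE and the variable names are made up for the example, and sep=r"\s+" is used as the whitespace separator.

```python
import io

import pandas as pd

# A few rows copied from the post; SAMPLE stands in for the real file.
SAMPLE = """Date Open High Low Close Volume
10/25/2017 0.742 0.78 0.735 0.7585 1,154,260
10/24/2017 0.776 0.792 0.73 0.745 1,718,896
10/23/2017 0.8067 0.8067 0.78 0.78 933,888
"""

# Select by name: the default header handling reads the first line as names,
# and the names must match exactly, including capitalisation.
by_name = pd.read_csv(
    io.StringIO(SAMPLE), sep=r"\s+", usecols=["Open", "High", "Low", "Close"]
)

# Select by name regardless of case: usecols may be a callable that is
# evaluated against each column name.
wanted = {"open", "high", "low", "close"}
any_case = pd.read_csv(
    io.StringIO(SAMPLE), sep=r"\s+", usecols=lambda name: name.lower() in wanted
)

# Select by position: skiprows=1 drops the header text so it is not
# parsed as a data row when header=None is used.
by_position = pd.read_csv(
    io.StringIO(SAMPLE), sep=r"\s+", header=None, skiprows=1, usecols=[1, 2, 3, 4]
)

print(by_name.dtypes)
```

Without skiprows=1, the header=None version would keep the "Date Open High ..." line as the first data row, which can leave the price columns as strings rather than floats.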

Hey, I'm gonna continue to have issues while I go through trying to build this project. Should I just keep creating a separate thread for each issue that pops up, so I can keep posting my solution at the end for others to follow, or should I keep it all in one thread? I just feel like I'm gonna have a lot of questions in the near future haha.

Thanks again Larz
#6
I would think new issue, new thread.
If you add it to an existing thread, you may not get an answer because
it will look as though your thread is being answered already.

