Python Forum
Read CSV data into Pandas DataSet From Variable?
#1
 import pandas

 url = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
 the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
 dataset = pandas.read_csv(url, names=the_names)
Sure, the code above works with the standard Pandas "read_csv".

But my issue is that I'm POSTing that CSV data to a Flask service. The data comes in (as a variable) and I extract it from the Request dict, but I then can't seem to find a compatible method to load the data held in that variable into the same pandas dataset.

I've tried read_clipboard, read_csv, read_table ... but they all error out.

Do I need to do some kind of IO step?

I'm missing something easy, I'm sure, but I did not see the answer online, where everyone seems to be reading the data from a disk file or from a URL directly.

Thanks in advance,
Reply
#2
From pandas 0.19.2 onward, read_csv can read directly from a URL.
On older versions you can work around it with io.StringIO, but you should really upgrade.
If you use Anaconda:
conda update conda
conda update anaconda
>>> import pandas as pd
>>> pd.__version__
'0.20.3'
G:\Anaconda3
λ python -m ptpython
>>> import pandas as pd
...
... url = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
... the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
... dataset = pd.read_csv(url, names=the_names)

>>> dataset
     sepal-length  sepal-width  petal-length  petal-width           class
0             5.1          3.5           1.4          0.2     Iris-setosa
1             4.9          3.0           1.4          0.2     Iris-setosa
2             4.7          3.2           1.3          0.2     Iris-setosa
3             4.6          3.1           1.5          0.2     Iris-setosa
4             5.0          3.6           1.4          0.2     Iris-setosa
5             5.4          3.9           1.7          0.4     Iris-setosa
6             4.6          3.4           1.4          0.3     Iris-setosa
Reply
#3
(Feb-26-2018, 12:27 AM)snippsat Wrote: After version pandas 0.19.2 --> it can read directly from url.

Yes, as I said in my initial posting reading from a URL works fine.

That was not the problem I am trying to solve. :)

My issue is that our application POSTs the data from another application to a Flask web service. I need a way to figure out how to get the POSTed data (in a variable) into the Pandas data set. From my original posting, I cannot find a compatible "read" method that can read a variable into a Pandas dataset.

So, how do you get CSV data, in a variable, (not in a URL, for example) into a Pandas dataset?

In the screenshot below, I tried to use the io.StringIO method, but that still throws 500 errors.

I also tried to just read in the data like pd.DataFrame(.....), but couldn't get the syntax correct.

Thanks,

Reply
#4
To receive a whole file in a POST, use Flask's file-upload mechanism.
This needs a form tag in the HTML with enctype=multipart/form-data.
Here is one way that receives the whole .csv file and can also iterate over its rows on the server.
Sample data.
Output:
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
from flask import Flask, make_response, request
import io
import csv

app = Flask(__name__)
def transform(text_file_contents):
    return text_file_contents.replace("=", ",")

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Transform a file demo</h1>
                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" />
                    <input type="submit" />
                </form>
            </body>
        </html>
    """

@app.route('/transform', methods=["POST"])
def transform_view():
    f = request.files['data_file']
    if not f:
        return "No file"
    stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
    csv_input = csv.reader(stream)
    for row in csv_input:
        print(row)

    stream.seek(0)
    result = transform(stream.read())
    response = make_response(result)
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"
    return response

if __name__ == "__main__":
    app.run(debug=True)
So the whole file comes in, and here is what gets printed on the server.
Output:
E:\1py_div\div_code\flask
λ python app.py
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 184-514-049
['5.1', '3.5', '1.4', '0.2', 'Iris-setosa']
['4.9', '3.0', '1.4', '0.2', 'Iris-setosa']
['4.7', '3.2', '1.3', '0.2', 'Iris-setosa']
['4.6', '3.1', '1.5', '0.2', 'Iris-setosa']
127.0.0.1 - - [26/Feb/2018 13:20:50] "POST /transform HTTP/1.1" 200 -
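If the goal is a pandas dataset rather than printed rows, the uploaded bytes can probably go straight into read_csv. This is only a minimal sketch: the route name /to_pandas and the function name are made up for illustration, and the column names are the ones from the first post, assuming the uploaded file has no header row.

```python
import io

import pandas as pd
from flask import Flask, request

app = Flask(__name__)

# Column names from the first post; assumes the CSV has no header row.
THE_NAMES = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

@app.route('/to_pandas', methods=["POST"])  # hypothetical route name
def to_pandas_view():
    f = request.files.get('data_file')
    if f is None:
        return "No file", 400
    # f.read() returns the uploaded bytes; BytesIO makes them look like a file.
    dataset = pd.read_csv(io.BytesIO(f.read()), names=THE_NAMES)
    return "Loaded {} rows and {} columns".format(*dataset.shape)
```

The form from the code above would work unchanged if its action pointed at this route.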
Reply
#5
There must be a simple way to read csv "data" without writing an entire method like that. I'm a bit baffled why there isn't just a "pd.read(...type='csv',....)" method that will take "CSV", for example, as an argument, but work the same way as pd.read_csv().

This omission seems glaring to me, yet, again, I'm probably missing something.

thanks,
Reply
#6
(Feb-26-2018, 12:48 PM)Oliver Wrote: There must be a simple way to read csv "data" without writing an entire method like that.
You have to follow the HTTP protocol and the way the framework deals with files sent over the network.
As you explain it, you want to send the data as one variable (I guess this means the whole content of the CSV?), so the easiest way is to treat it like a file object.
Trying to recreate the data from request.values will be difficult.
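If the other application sends the CSV text as the raw request body instead of a file upload, one alternative is to take the bytes from request.get_data() and wrap them in a file object. A sketch under that assumption; the route name /load_csv and the JSON reply are invented for the example, and the column names come from the first post.

```python
import io

import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Column names from the first post; assumes the body has no header row.
THE_NAMES = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

@app.route('/load_csv', methods=["POST"])  # hypothetical route name
def load_csv():
    raw = request.get_data()  # raw request body as bytes
    if not raw:
        return "Empty body", 400
    # Wrap the bytes in a file-like object so read_csv can parse them.
    dataset = pd.read_csv(io.BytesIO(raw), names=THE_NAMES)
    return jsonify(rows=len(dataset), columns=list(dataset.columns))
```

This only fits when the body is nothing but CSV; form fields or multipart uploads still need request.files as in the code above.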

Note that the code above returns the whole uploaded file as result.csv.
You can then open it locally with pd.read_csv('result.csv').
If you want to send it back to a view, you could use tablib, which generates an HTML table.
Example:
from flask import Flask, make_response, request
import io, os
import csv
import tablib

app = Flask(__name__)
def transform(text_file_contents):
    return text_file_contents.replace("=", ",")

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Transfer a file demo</h1>
                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" />
                    <input type="submit" />
                </form>
                <br>
                <a href="/read_cvs">Read csv</a>                
            </body>
        </html>
    """    

@app.route('/transform', methods=["POST"])
def transform_view():
    f = request.files['data_file']
    if not f:
        return "No file"
    stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
    '''
    csv_input = csv.reader(stream)
    for row in csv_input:
        print(row)'''
    stream.seek(0)
    result = transform(stream.read())
    response = make_response(result)
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"   
    return response

@app.route('/read_cvs', methods=["GET"])
def read_csv():
    dataset = tablib.Dataset()
    # The path is absolute, so joining it with the script directory has no effect.
    with open('C:/Users/Tom/Downloads/result.csv') as f:
        dataset.csv = f.read()
    return dataset.html    

if __name__ == "__main__":
    app.run(debug=True)
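If pandas is already in the picture, DataFrame.to_html could stand in for tablib when building the table. This is a swapped-in alternative, not what the code above does; the sample rows are the ones from the earlier post and the variable names are arbitrary.

```python
import io

import pandas as pd

# Sample rows from the earlier post, held in a plain string variable.
csv_text = "5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

# StringIO turns the string into a file-like object for read_csv.
dataset = pd.read_csv(io.StringIO(csv_text), names=names)

# Render the frame as an HTML <table>; index=False leaves out the row numbers.
html_table = dataset.to_html(index=False)
```

A Flask view could return html_table the same way read_csv() above returns dataset.html.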
 
Reply
#7
OK, I appreciate your help with this.

It really seems that instead of POSTing the data, the database should probably just export the CSV to a temporary disk path so the read_csv works easily.

Thanks again! :)
Reply
#8
Hi, I'm using Tornado web server and have a similar situation. There's not much to do actually.. the file's contents come as a bytestring. Assuming you've gotten the contents into a variable 'file1' as OP mentioned,

df = pd.read_csv( io.BytesIO(file1) )
should do the job. Do import io at the top of your code.
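A small standalone check of the idea, assuming the variable holds the iris rows from the first post (the names csv_bytes and csv_text are arbitrary). Bytes go through io.BytesIO; a str that has already been decoded goes through io.StringIO instead.

```python
import io

import pandas as pd

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

# CSV content as it typically arrives from an upload: a bytestring.
csv_bytes = b"5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n"
df_from_bytes = pd.read_csv(io.BytesIO(csv_bytes), names=names)

# The same content as text (str): wrap it in StringIO instead.
csv_text = csv_bytes.decode("utf-8")
df_from_text = pd.read_csv(io.StringIO(csv_text), names=names)
```

Both calls should give the same DataFrame; which wrapper to use depends only on whether the variable is bytes or str.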

As to how I got that file's contents into a variable, here's a shortened snippet and I'm guessing the structure should be similar in your framework.

On the HTML side:
<p><input type="file" name="file1"></p>

Python side:
import io

import pandas as pd
import tornado.web

# skipping the tornado specific code...

class hydGTFS(tornado.web.RequestHandler):
	def post(self):
		print( self.request.files['file1'][0]['filename'] )

		df = pd.read_csv( io.BytesIO( self.request.files['file1'][0]['body']) )
		print(df.head())
		self.write('ok got it bro')

Output:
Corridor 1 Week days detail..csv
   Run Id  Run Description  Trip Id Regulation Period        Group  Line Id
0      49             4901     6144           Default  {new group}       47
1      49             4901     6144           Default  {new group}       47
2      49             4901     6144           Default  {new group}       47
3      49             4901     6144           Default  {new group}       47
4      49             4901     6141           Default  {new group}       37
Reply

