Python Forum
Read CSV data into Pandas DataSet From Variable?
#1
import pandas

url = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=the_names)
Sure, the code above works with the standard Pandas "read_csv".

But my issue is that I'm POSTing that CSV data to a Flask service. The data comes in (as a variable) and I extract it from the Request dict, but I then can't seem to find a compatible method to load the data held in that variable into the same pandas dataset.

I've tried read_clipboard, read_csv, read_table ... but they all error out.

Do I need to do some kind of IO step?
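
A minimal sketch of such an IO step, assuming the POSTed CSV has already been extracted from the request as a str. The csv_text variable and its sample rows are made up for illustration. read_csv accepts file-like objects, so wrapping the string in io.StringIO should be enough:

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV text pulled out of the POST request.
csv_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)
the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

# read_csv accepts any file-like object, so wrap the string in StringIO.
dataset = pd.read_csv(io.StringIO(csv_text), names=the_names)
print(dataset)
```

Note that io.StringIO only accepts str. If the variable holds bytes, it has to be decoded first or wrapped in io.BytesIO instead.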

I'm surely missing something easy, but I did not see the answer online, where everyone seems to be reading the data from a disk file or directly from a URL.

Thanks in advance,
Reply
#2
After pandas version 0.19.2, read_csv can read directly from a URL.
On older versions you can work around it with io.StringIO, but you should really upgrade.
If you use Anaconda:
conda update conda
conda update anaconda
>>> import pandas as pd
>>> pd.__version__
'0.20.3'
G:\Anaconda3
λ python -m ptpython
>>> import pandas as pd
...
... url = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
... the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
... dataset = pd.read_csv(url, names=the_names)

>>> dataset
     sepal-length  sepal-width  petal-length  petal-width           class
0             5.1          3.5           1.4          0.2     Iris-setosa
1             4.9          3.0           1.4          0.2     Iris-setosa
2             4.7          3.2           1.3          0.2     Iris-setosa
3             4.6          3.1           1.5          0.2     Iris-setosa
4             5.0          3.6           1.4          0.2     Iris-setosa
5             5.4          3.9           1.7          0.4     Iris-setosa
6             4.6          3.4           1.4          0.3     Iris-setosa
Reply
#3
(Feb-26-2018, 12:27 AM)snippsat Wrote: After version pandas 0.19.2 --> it can read directly from url.

Yes, as I said in my initial posting, reading from a URL works fine.

That was not the problem I posted about. :)

My issue is that our application POSTs the data from another application to a Flask web service. I need a way to get the POSTed data (in a variable) into the Pandas dataset. As in my original posting, I cannot find a compatible "read" method that can read a variable into a Pandas dataset.

So, how do you get CSV data held in a variable (not at a URL, for example) into a Pandas dataset?

In the screenshot below, I tried to use the io.StringIO method, but that still throws 500 errors.

I also tried to just read in the data like pd.DataFrame(.....), but couldn't get the syntax correct.
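
For the pd.DataFrame(...) route, here is a rough sketch, assuming the variable holds plain CSV text. The csv_text sample and the numeric_cols name are invented for illustration. The rows are parsed with the standard csv module first, then handed to the DataFrame constructor:

```python
import csv
import io

import pandas as pd

# Hypothetical CSV text standing in for the POSTed variable.
csv_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)
the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

# Parse the text into a list of rows (each row is a list of strings).
rows = list(csv.reader(io.StringIO(csv_text)))

# Build the frame from the parsed rows and name the columns.
dataset = pd.DataFrame(rows, columns=the_names)

# csv.reader yields strings only, so convert the numeric columns explicitly.
numeric_cols = the_names[:-1]
dataset[numeric_cols] = dataset[numeric_cols].astype(float)
print(dataset.dtypes)
```

Unlike read_csv, this path does no type inference, which is why the explicit conversion step is needed.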

Thanks,

Attached Files

Thumbnail(s)
   
Reply
#4
To get the whole file via POST, use Flask's file-upload mechanism.
This needs an HTML form tag with enctype=multipart/form-data.
Here is one way that receives the whole .csv file and can also iterate over its rows on the server.
Sample data:
Output:
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
from flask import Flask, make_response, request
import io
import csv

app = Flask(__name__)
def transform(text_file_contents):
    return text_file_contents.replace("=", ",")

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Transform a file demo</h1>
                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" />
                    <input type="submit" />
                </form>
            </body>
        </html>
    """

@app.route('/transform', methods=["POST"])
def transform_view():
    f = request.files['data_file']
    if not f:
        return "No file"
    stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
    csv_input = csv.reader(stream)
    for row in csv_input:
        print(row)

    stream.seek(0)
    result = transform(stream.read())
    response = make_response(result)
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"
    return response

if __name__ == "__main__":
    app.run(debug=True)
So the whole file comes in, and here is what gets printed on the server.
Output:
E:\1py_div\div_code\flask
λ python app.py
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 184-514-049
['5.1', '3.5', '1.4', '0.2', 'Iris-setosa']
['4.9', '3.0', '1.4', '0.2', 'Iris-setosa']
['4.7', '3.2', '1.3', '0.2', 'Iris-setosa']
['4.6', '3.1', '1.5', '0.2', 'Iris-setosa']
127.0.0.1 - - [26/Feb/2018 13:20:50] "POST /transform HTTP/1.1" 200 -
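
To tie this back to pandas, here is a minimal sketch of an upload view that loads the file straight into a DataFrame instead of printing rows. The /load route, the load_view name and the sample bytes are assumptions made for illustration. It is exercised with Flask's test client, so no server has to be started:

```python
import io

import pandas as pd
from flask import Flask, request

app = Flask(__name__)
the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']


@app.route('/load', methods=["POST"])  # hypothetical route name
def load_view():
    f = request.files.get('data_file')
    if f is None:
        return "No file", 400
    # f.read() returns bytes; BytesIO gives read_csv a file-like object.
    dataset = pd.read_csv(io.BytesIO(f.read()), names=the_names)
    return "Loaded {} rows".format(len(dataset))


# Simulate the multipart form upload with the test client.
sample = b"5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n"
client = app.test_client()
response = client.post(
    '/load',
    data={'data_file': (io.BytesIO(sample), 'iris.csv')},
    content_type='multipart/form-data',
)
print(response.get_data(as_text=True))
```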
Reply
#5
There must be a simple way to read CSV "data" without writing an entire method like that. I'm a bit baffled why there isn't just a "pd.read(...type='csv',....)" method that takes "CSV", for example, as an argument but works the same way as pd.read_csv().

This omission seems glaring to me, yet, again, I'm probably missing something.
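
For what it's worth, read_csv itself accepts file-like objects, so a small wrapper can play the role of the reader wished for here. The sketch below is only an illustration: read_csv_variable is a hypothetical helper name, not a pandas function, and it assumes the data arrives as either str or bytes.

```python
import io

import pandas as pd


def read_csv_variable(data, **kwargs):
    """Hypothetical helper: load CSV held in a str or bytes variable."""
    if isinstance(data, (bytes, bytearray)):
        buffer = io.BytesIO(data)   # binary data needs a bytes buffer
    elif isinstance(data, str):
        buffer = io.StringIO(data)  # text data needs a text buffer
    else:
        raise TypeError("expected str or bytes, got {}".format(type(data).__name__))
    # Extra keyword arguments (names=, sep=, ...) pass through to read_csv.
    return pd.read_csv(buffer, **kwargs)


from_text = read_csv_variable("a,b\n1,2\n3,4\n")
from_bytes = read_csv_variable(b"a,b\n1,2\n3,4\n")
print(from_text)
```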

thanks,
Reply
#6
(Feb-26-2018, 12:48 PM)Oliver Wrote: There must be a simple way to read csv "data" without writing an entire method like that.
You have to follow the HTTP protocol and the way the framework handles files sent over the network.
As you explain it, you want to send the data as one variable (I guess this means the whole content of the CSV?), so the easiest way is to treat it as a file object.
Trying to recreate the data from request.values will be difficult.
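
If the sending application can put the CSV in the raw request body instead of a form field, a sketch like the following may be enough. It assumes a text/csv body, and the /ingest route and ingest name are hypothetical. The body is read as text with request.get_data and passed to read_csv through io.StringIO:

```python
import io

import pandas as pd
from flask import Flask, request

app = Flask(__name__)


@app.route('/ingest', methods=["POST"])  # hypothetical route name
def ingest():
    # Read the raw request body as text (assumes the client sends plain CSV).
    csv_text = request.get_data(as_text=True)
    if not csv_text.strip():
        return "No data", 400
    dataset = pd.read_csv(io.StringIO(csv_text), header=None)
    return "Loaded {} rows, {} columns".format(*dataset.shape)


# Exercise the route with the test client instead of a running server.
client = app.test_client()
response = client.post(
    '/ingest',
    data="5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n",
    content_type="text/csv",
)
print(response.get_data(as_text=True))
```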

Note that the code above returns the whole uploaded file as result.csv.
You can then open it locally with pd.read_csv('result.csv').
If you are sending it back to a view, you could use tablib, which can render an HTML table.
Example:
[Image: nLslPU.png]
from flask import Flask, make_response, request
import io, os
import csv
import tablib

app = Flask(__name__)
def transform(text_file_contents):
    return text_file_contents.replace("=", ",")

@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Transfer a file demo</h1>
                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" />
                    <input type="submit" />
                </form>
                <br>
                <a href="/read_cvs">Read csv</a>                
            </body>
        </html>
    """    

@app.route('/transform', methods=["POST"])
def transform_view():
    f = request.files['data_file']
    if not f:
        return "No file"
    stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
    '''
    csv_input = csv.reader(stream)
    for row in csv_input:
        print(row)'''
    stream.seek(0)
    result = transform(stream.read())
    response = make_response(result)
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"   
    return response

@app.route('/read_cvs', methods=["GET"])
def read_csv():
    dataset = tablib.Dataset()
    # os.path.join() discards earlier parts when given an absolute path,
    # so the downloaded file is opened directly.
    with open('C:/Users/Tom/Downloads/result.csv') as f:
        dataset.csv = f.read()
    return dataset.html    

if __name__ == "__main__":
    app.run(debug=True)
 
Reply
#7
OK, I appreciate your help with this.

It really seems that instead of POSTing the data, the database should probably just export the CSV to a temporary disk path so the read_csv works easily.
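
A rough sketch of that temporary-file idea, assuming the CSV text is already in a variable. The csv_text sample and the export.csv file name are made up for illustration. The text is written into a temporary directory and then read back with the ordinary path-based read_csv:

```python
import os
import tempfile

import pandas as pd

# Hypothetical CSV text standing in for the exported data.
csv_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)
the_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']

# The directory and everything in it is removed when the block exits.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "export.csv")
    with open(csv_path, "w", newline="") as f:
        f.write(csv_text)
    # Read the file back while the temporary directory still exists.
    dataset = pd.read_csv(csv_path, names=the_names)

print(dataset)
```

This costs a disk round trip that the in-memory io.StringIO or io.BytesIO approaches avoid, so it mostly makes sense when the exporting side can only produce files.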

Thanks again! :)
Reply
#8
Hi, I'm using the Tornado web server and have a similar situation. There's not much to do, actually: the file's contents come as a bytestring. Assuming you've gotten the contents into a variable 'file1' as the OP mentioned,

df = pd.read_csv( io.BytesIO(file1) )
should do the job. Do import io at the top of your code.

As to how I got that file's contents into a variable, here's a shortened snippet and I'm guessing the structure should be similar in your framework.

On the HTML side:
<p><input type="file" name="file1"></p>

Python side:
import io

import pandas as pd
import tornado.web

# skipping the rest of the tornado-specific code...

class hydGTFS(tornado.web.RequestHandler):
	def post(self):
		print( self.request.files['file1'][0]['filename'] )

		df = pd.read_csv( io.BytesIO( self.request.files['file1'][0]['body']) )
		print(df.head())
		self.write('ok got it bro')
Output:
Corridor 1 Week days detail..csv
   Run Id  Run Description  Trip Id Regulation Period        Group  Line Id
0      49             4901     6144           Default  {new group}       47
1      49             4901     6144           Default  {new group}       47
2      49             4901     6144           Default  {new group}       47
3      49             4901     6144           Default  {new group}       47
4      49             4901     6141           Default  {new group}       37
Reply

