Python Forum
CVS file to EXCEL - Printable Version



CVS file to EXCEL - DavidTheGrockle - Oct-23-2019

I have a CVS file, published by the author of a scientific paper, with 700,000 rows. I want to split this file into smaller "units", preferably putting the results into EXCEL.

The limits of the "units" are set by the contents of column 9 of 10. Column 9 runs for roughly 1000 rows and then the content of (row,9) changes.

I understand that pandas will do this and I understand the general way to go.

BUT I got stuck on the detail: I just cannot figure out how to run down (row,9) until (row,9) changes from, say, E456 to BV789.

Please point me to a good descriptive reference because I haven't, so far, been able to find one.
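
One way to "run down" a column until its value changes, sketched here under assumptions: the column name track_id and the sample rows are invented for illustration and are not taken from the actual file. Comparing the column with a copy of itself shifted down one row marks every row where the value changes, and a running sum of those marks numbers the consecutive blocks.

```python
import pandas as pd

# Hypothetical miniature of the data; "track_id" stands in for column 9.
df = pd.DataFrame({
    "track_id": ["E456", "E456", "E456", "BV789", "BV789", "E456"],
    "lat": [53.1, 53.2, 53.3, 51.7, 51.8, 53.4],
})

key = df["track_id"]
# True on every row whose value differs from the row above it
# (the first row is compared with a missing value, so it is True).
changed = key != key.shift()
# A running count of the changes gives each consecutive block its own number.
block_id = changed.cumsum()

# One DataFrame per consecutive block, in file order.
blocks = [block for _, block in df.groupby(block_id)]
```

Unlike grouping on the value itself, this keeps a value that reappears later (E456 in the last row) as a separate block, which may or may not be what is wanted.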


RE: CVS file to EXCEL - Larz60+ - Oct-23-2019

This looks like a good start, without the pomp: https://towardsdatascience.com/quick-dive-into-pandas-for-data-science-cc1c1a80d9c4


RE: CVS file to EXCEL - DavidTheGrockle - Oct-23-2019

I think I can manage things now that I have read this. Thank you.


RE: CVS file to EXCEL - ichabod801 - Oct-23-2019

I knew those CVS receipts were getting out of hand, but 700,000 rows? ;)


RE: CVS file to EXCEL - Larz60+ - Oct-23-2019

I still make that mistake from time to time.


RE: CVS file to EXCEL - ichabod801 - Oct-23-2019

To be honest, so do I.


RE: CVS file to EXCEL - scidam - Oct-24-2019

Do you want to group your data by the values in a certain column and write the
content of each group to a file? If so, your code might look like this:

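import pandas as pd
data = pd.read_csv('your big file.csv')  # load the whole file into a DataFrame

# iloc counts from 0, so 9 selects the 10th column; use 8 for the 9th.
for gr, d in data.groupby(data.iloc[:, 9]):
    # One output file per distinct value in that column.
    d.to_csv('output_%s.csv' % gr)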
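
Since the original question asked for EXCEL output, the same grouping could write one sheet per group instead of one CSV per group. This is a rough sketch, not a tested recipe for the real file: the helper names safe_sheet_name and write_groups_to_excel are made up here, and writing .xlsx assumes an Excel engine such as openpyxl is installed.

```python
import re
import pandas as pd

def safe_sheet_name(value):
    """Turn a group key into a string Excel accepts as a sheet name."""
    # Excel rejects \ / ? * [ ] : in sheet names and caps them at 31 characters.
    return re.sub(r"[\\/?*\[\]:]", "_", str(value))[:31]

def write_groups_to_excel(data, key_column, path):
    """Write each group of `data` to its own sheet in a single workbook."""
    # Assumption: an Excel engine such as openpyxl is available to pandas.
    with pd.ExcelWriter(path) as writer:
        for key, group in data.groupby(key_column):
            group.to_excel(writer, sheet_name=safe_sheet_name(key), index=False)
```

A call might look like write_groups_to_excel(data, data.columns[8], 'output.xlsx'). Two keys that become identical after cleaning would collide on the sheet name, so with several hundred groups separate files may be the simpler choice.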



RE: CVS file to EXCEL - DavidTheGrockle - Oct-24-2019

"for gr, d in data.groupby(data.iloc[:, 9]):" - Please can you recommend a tutorial about the instructions for use with pandas ?

707,000 data rows. FYI these were the GPS locations of flights by a bird called the Manx Shearwater and gathered over a number of years by a group of people researching how the birds navigated. 700+ flight paths, GPS readings every five minutes. (!!)


RE: CVS file to EXCEL - scidam - Oct-24-2019

(Oct-24-2019, 10:59 AM)DavidTheGrockle Wrote: "for gr, d in data.groupby(data.iloc[:, 9]):" - Please can you recommend a tutorial about the instructions for use with pandas ?
Official documentation will be enough, I think. Pandas can read your data in chunks (however, I've never used groupby with chunked data). Suppose each row holds about 1 KB of data (a reasonable assumption, it seems); then the file is about 700 MB. Processing it in one go should not be a problem if your computer has at least 8 GB of memory.
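
For reference, chunked reading looks roughly like the sketch below. The in-memory CSV text and the column name track_id are stand-ins invented for the example, and the chunk size of 2 is only there to force several chunks; a real run would use something far larger.

```python
import io
from collections import Counter
import pandas as pd

# Stand-in for the big file; the column names are invented for this sketch.
csv_text = (
    "track_id,lat\n"
    "E456,53.1\n"
    "E456,53.2\n"
    "BV789,51.7\n"
    "BV789,51.8\n"
    "BV789,51.9\n"
)

rows_per_track = Counter()
# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    # A track can straddle a chunk boundary, so counts are accumulated
    # across chunks rather than taken from any single chunk.
    rows_per_track.update(chunk["track_id"].value_counts().to_dict())
```

The point to watch is the chunk boundary: a group can be split between two chunks, so any per-group result has to be combined across chunks, as the counter does here.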


RE: CVS file to EXCEL - DavidTheGrockle - Oct-27-2019

I cannot get my CVS data into a variable. What I have is:
import pandas as pd
print("Hello, World!")
# Read the data into a variable pacto
url = "F:\Carrier Bag F\NAV Padget\Padget.csv"
pacto = pd.read_csv(url)
It seems to be unable to find the file.
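
A likely cause, offered as a guess since no error message is shown: in an ordinary Python string a backslash starts an escape sequence, and \N in particular begins a named Unicode escape, so Python 3 rejects this literal before read_csv ever runs. A raw string or forward slashes avoids the problem. The sketch below only checks that the two spellings name the same Windows path; it does not touch the disk.

```python
from pathlib import PureWindowsPath

# Raw string: the r prefix stops Python from treating backslashes as escapes.
raw_path = r"F:\Carrier Bag F\NAV Padget\Padget.csv"
# Forward slashes are also accepted as separators in Windows paths.
slash_path = "F:/Carrier Bag F/NAV Padget/Padget.csv"

# Both spellings parse to the same drive, folders and file name.
same = PureWindowsPath(raw_path) == PureWindowsPath(slash_path)

# Either form can then be passed on, for example:
# pacto = pd.read_csv(raw_path)
```

If the file still cannot be found with one of these forms, the next thing to check would be the exact folder and file names on the drive.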