Python Forum

Full Version: CVS file to EXCEL
Pages: 1 2
I have a CVS file, published by the author of a scientific paper, with 700,000 rows. I want to split this file into smaller "units", preferably putting the results into Excel.

The limits of the "units" are set by the contents of column 9 (of 10). The value in column 9 stays the same for roughly 1,000 rows, and then the content of (row, 9) changes.

I understand that pandas will do this and I understand the general way to go.

BUT I got stuck on a detail: I just cannot figure out how to run down (row, 9) until the value changes from, say, E456 to BV789.

Please point me to a good descriptive reference because I haven't, so far, been able to find one.
This looks like a good start, without the pomp: https://towardsdatascience.com/quick-div...1c1a80d9c4
I think I can manage things now that I have read this. Thank you.
I knew those CVS receipts were getting out of hand, but 700,000 rows? Wink
I still make that mistake from time to time.
To be honest, so do I.
Do you want to group your data by the values in a certain column and write the content of each group to a file? If so, your code might look like this:

import pandas as pd

data = pd.read_csv('your big file.csv')

# Group on the 10th column (position 9) and write each group to its own file
for gr, d in data.groupby(data.iloc[:, 9]):
    d.to_csv(f'output_{gr}.csv')
"for gr, d in data.groupby(data.iloc[:, 9]):" - Please can you recommend a tutorial about the instructions for use with pandas ?

707,000 data rows. FYI these were the GPS locations of flights by a bird called the Manx Shearwater and gathered over a number of years by a group of people researching how the birds navigated. 700+ flight paths, GPS readings every five minutes. (!!)
(Oct-24-2019, 10:59 AM)DavidTheGrockle Wrote: "for gr, d in data.groupby(data.iloc[:, 9]):" - Please can you recommend a tutorial about the instructions for use with pandas ?
The official documentation should be enough, I think. Pandas can also read your data in chunks (though I've never used groupby with chunked data). Suppose you have 1 KB of data per row (which seems a reasonable assumption); that makes roughly a 700 MB file. Processing it in one go should not be a problem if your computer has at least 8 GB of memory.
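For reference, chunked reading looks like this. A minimal sketch: the in-memory buffer stands in for a big file on disk, and the chunk size of 2 is only for demonstration (with your data you would pass the file path and something like chunksize=100_000):

```python
import io
import pandas as pd

# Stand-in for a large CSV on disk; with a real file, pass its path instead.
buf = io.StringIO("a,b\n1,2\n3,4\n5,6\n7,8\n")

# chunksize makes read_csv return an iterator of DataFrames instead of
# loading the whole file into memory at once.
pieces = [chunk for chunk in pd.read_csv(buf, chunksize=2)]

print(len(pieces))           # number of chunks (here: 4 rows / 2 per chunk = 2)
print(pieces[0].shape[0])    # rows in the first chunk
```

Each chunk is an ordinary DataFrame, so you can filter or write each one out before reading the next.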
I cannot get my CVS data into a variable. What I have is:
import pandas as pd
print("Hello, World!")
# Read the data into a variable pacto
url = "F:\Carrier Bag F\NAV Padget\Padget.csv"
pacto = pd.read_csv(url)
It seems to be unable to find the file.
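The backslashes in that path are the likely culprit: in an ordinary Python string literal, "\N" starts a Unicode escape, so the path never reaches read_csv intact. A raw string (or forward slashes, which Windows also accepts) avoids this. A minimal sketch, with the path from the post above assumed:

```python
import pandas as pd

# Raw string: the r prefix keeps backslashes literal, so "\N" is not
# treated as an escape sequence.
path = r"F:\Carrier Bag F\NAV Padget\Padget.csv"
print(path)

# Forward slashes work on Windows too:
# path = "F:/Carrier Bag F/NAV Padget/Padget.csv"

# Then read as before (commented out here, since the file lives on your disk):
# pacto = pd.read_csv(path)
```

If the path is right and the file exists, read_csv should then load it into the variable as expected.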