CVS file to EXCEL (Python Forum, General Coding Help)
CVS file to EXCEL - DavidTheGrockle - Oct-23-2019

I have a CVS file, published by the author of a scientific paper, with 700,000 rows. I want to split this file into smaller "units", preferably putting the results into EXCEL. The boundaries of the "units" are set by the contents of column 9 of 10: the value in column 9 stays the same for roughly 1000 rows and then changes. I understand that pandas will do this, and I understand the general approach, BUT I am stuck on the detail: I just cannot figure out how to run down (row,9) until (row,9) changes from, say, E456 to BV789. Please point me to a good descriptive reference, because so far I haven't been able to find one.

RE: CVS file to EXCEL - Larz60+ - Oct-23-2019

This looks like a good start, without the pomp: https://towardsdatascience.com/quick-dive-into-pandas-for-data-science-cc1c1a80d9c4

RE: CVS file to EXCEL - DavidTheGrockle - Oct-23-2019

I think I can manage things now that I have read this. Thank you.

RE: CVS file to EXCEL - ichabod801 - Oct-23-2019

I knew those CVS receipts were getting out of hand, but 700,000 rows?

RE: CVS file to EXCEL - Larz60+ - Oct-23-2019

I still make that mistake from time to time.

RE: CVS file to EXCEL - ichabod801 - Oct-23-2019

To be honest, so do I.

RE: CVS file to EXCEL - scidam - Oct-24-2019

Do you want to group your data by the values in a certain column and write the content of each group to a file? If so, your code might look like this:

    import pandas as pd

    data = pd.read_csv('your big file.csv')
    # iloc is zero-based, so 9 selects the tenth column; use 8 for the ninth.
    for gr, d in data.groupby(data.iloc[:, 9]):
        # Write each group to its own CSV file, named after the group value.
        d.to_csv('output_%s.csv' % gr)

RE: CVS file to EXCEL - DavidTheGrockle - Oct-24-2019

"for gr, d in data.groupby(data.iloc[:, 9]):" - Please can you recommend a tutorial about the instructions for use with pandas? 707,000 data rows. FYI, these were the GPS locations of flights by a bird called the Manx Shearwater, gathered over a number of years by a group of people researching how the birds navigated.
700+ flight paths, GPS readings every five minutes. (!!)

RE: CVS file to EXCEL - scidam - Oct-24-2019

(Oct-24-2019, 10:59 AM) DavidTheGrockle Wrote: "for gr, d in data.groupby(data.iloc[:, 9]):" - Please can you recommend a tutorial about the instructions for use with pandas?

The official documentation will be enough, I think. Pandas can read your data in chunks (however, I've never used groupby with chunked data). Assuming about 1 KB of data per row (which seems a reasonable assumption), you have a 700 MB file. It should not be a problem to process this file all at once if your computer has at least 8 GB of memory.

RE: CVS file to EXCEL - DavidTheGrockle - Oct-27-2019

I cannot get my CVS data into a variable. What I have is:

    import pandas as pd
    print("Hello, World!")
    # Read the data into a variable pacto
    url = "F:\Carrier Bag F\NAV Padget\Padget.csv"
    pacto = pd.read_csv(url)

It seems to be unable to find the file.
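
One possible cause of the problem in the last post, offered as an assumption rather than a diagnosis: in an ordinary Python string literal a backslash starts an escape sequence, and in Python 3 `\N` in particular is read as the start of a `\N{...}` named-character escape, so a Windows path written this way may not mean what it appears to. A minimal sketch of two safer spellings follows, using the path from the post. The variable names are illustrative, and `PureWindowsPath` is used only so that the comparison runs on any operating system.

```python
from pathlib import PureWindowsPath

# Raw string (note the r prefix): backslashes are kept literally, so "\N"
# is not interpreted as the start of a "\N{...}" Unicode escape.
raw_path = r"F:\Carrier Bag F\NAV Padget\Padget.csv"

# Forward slashes are also accepted in Windows paths by Python.
slash_path = "F:/Carrier Bag F/NAV Padget/Padget.csv"

# PureWindowsPath parses Windows-style paths without touching the disk,
# which makes it easy to confirm that both spellings name the same file.
same_file = PureWindowsPath(raw_path) == PureWindowsPath(slash_path)
file_name = PureWindowsPath(raw_path).name
path_parts = PureWindowsPath(raw_path).parts
```

Either string could then be passed to `pd.read_csv` in place of `url`. On the machine that holds the file, `pathlib.Path(raw_path).exists()` is a quick way to check that the path really points at it before reading.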