 Reformat csv data with Python
Hello. I am a newbie to Python taking an intro data analytics class. I need to reshape the data from an imported UN csv file to match my other csv files, i.e., columns 'Country or Area', 'Year', 'Value', with rows grouped by country, sorted by year descending within each country, and holding the HDI value for each year.

The UN data csv I need to convert has the columns 'Country or Area', '1990', '1991', '1992' ... '2014', 'HDI Rank'. Each row contains a country name, the value for each year column, and lastly the overall HDI Rank value.

I am using a Jupyter notebook and don't know the code to manipulate the imported UN csv data so that I can export a csv with the columns 'Country or Area', 'Year', 'Value', where the rows are sorted first by country name (repeated on each row), then by year descending, with the value for each year.

Help in the right direction is much appreciated!
What the heck is a UN CSV file, and how does it differ from a CSV file?

Now for the code: we won't do your homework for you. You need to make an effort first, then we will help with problems as you encounter them.
(Jul-23-2018, 08:58 PM)Larz60+ Wrote: What the heck is a UN CSV file
I guess it's a csv file from the UN (United Nations) statistical database :-)
Correct, UN is United Nations. Sorry, I should have been clearer. This is for a project in an intro class where I want to go beyond what we have covered and are responsible for. I don't need all the code, just some guidance on which libraries to use, e.g., pandas reindex? I can research how to use the libraries, but I'm not sure where to start.
A csv file is text, so it can be read like any text file.
But it's better to read it through a csv.reader object, since each line will be split into a list.
To use the csv package, you do something like:
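import csv

def read_file(filename):
    # newline='' is what the csv module docs recommend when opening csv files
    with open(filename, 'r', newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            # each row is a list of column strings
            for col in row:
                print('{0:<15} '.format(col), end='')
            print()  # end the output line after each row

if __name__ == '__main__':
    read_file('data.csv')  # placeholder filename, replace with your own file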
If the first row contains headers, you can skip it, or use it to format the data into a dictionary or some other structure.
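Both options for the header row can be sketched roughly as below. This is only an illustration: the sample data is made up, and the names SAMPLE, rows_without_header and rows_as_dicts are hypothetical helpers, not part of the csv module.

```python
import csv
import io

# Made-up sample standing in for the real file contents.
SAMPLE = "Country or Area,1990,1991\nNorway,0.85,0.86\nChad,0.30,0.31\n"

def rows_without_header(f):
    """Skip the header row and return the remaining rows as lists."""
    reader = csv.reader(f)
    next(reader, None)  # discard the first row (the headers)
    return list(reader)

def rows_as_dicts(f):
    """Use the header row as keys, returning one dict per data row."""
    reader = csv.DictReader(f)  # reads the first row as field names
    return [dict(row) for row in reader]

plain_rows = rows_without_header(io.StringIO(SAMPLE))
dict_rows = rows_as_dicts(io.StringIO(SAMPLE))
print(plain_rows[0])
print(dict_rows[0]["Country or Area"], dict_rows[0]["1990"])
```

Note that the csv module returns every value as a string, so numeric columns still need converting (e.g. with float()) before doing any arithmetic on them.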
Thanks for the response. Someone else pointed me to the pandas stack() function, which did the trick for this inexperienced newbie.
Stack function? Please explain how this reads a CSV file.
I can pd.read_csv() the files into a df just fine, but I needed to reshape that df so it had the same layout as the other dfs I got from importing the other csv files I'm working with.
That doesn't explain the stack function, but never mind.
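For anyone landing here later, the reshape probably looks something like the sketch below. It is not the exact code used in this thread: the sample values are made up, wide_to_long is a hypothetical helper name, and it assumes the column names described in the first post.

```python
import io
import pandas as pd

# Tiny stand-in for the UN file; the values are made up for illustration.
sample_csv = io.StringIO(
    "Country or Area,1990,1991,1992,HDI Rank\n"
    "Norway,0.85,0.86,0.87,1\n"
    "Chad,0.30,0.31,0.32,185\n"
)

def wide_to_long(df):
    """Reshape one-column-per-year data into Country/Year/Value rows."""
    # Drop the rank column, since it is not a yearly value.
    years_only = df.drop(columns=["HDI Rank"]).set_index("Country or Area")
    # stack() moves the year column labels into the row index.
    long_df = years_only.stack().reset_index()
    long_df.columns = ["Country or Area", "Year", "Value"]
    long_df["Year"] = long_df["Year"].astype(int)
    # Country ascending, then year descending within each country.
    return long_df.sort_values(
        ["Country or Area", "Year"], ascending=[True, False]
    ).reset_index(drop=True)

wide = pd.read_csv(sample_csv)
long_df = wide_to_long(wide)
print(long_df)
```

The result can then be written out with long_df.to_csv('hdi_long.csv', index=False), where the filename is just a placeholder. pandas also has melt(), which does a similar wide-to-long reshape without going through the index.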
