Aug-05-2019, 09:39 AM
Thanks, Yoriz, for the prompt help.
Actually, the data is in a CSV file that came from more than 5000 respondents. One of the variables is method_discussed, which has more than 5000 data points, and each data point may be any combination of items from this list:
method_names = ['female_condoms', 'emergency', 'male_condoms', 'pill', 'injectables', 'iud', 'male_sterilization', 'female_sterilization'].
For example:
respondent method_discussed
respondent1 female_condoms injectables
respondent2 male_sterilization pill
respondent3 blank (no method)
.
.
.
so on
respondent5000 male_sterilization female_sterilization
.
.
I imported pandas as pd, read the CSV file, and made a list of these 8 methods. I want to generate 8 variables named after these 8 items, where each data point is 0 (the item is absent from method_discussed) or 1 (the item is present in method_discussed), as you have done, but written back to the same CSV file and saved rather than only held in memory.
I don't want these results only in memory as you have done, but in the DataFrame. Second, I want to bring to your notice that I don't want to assign methods_discussed by hand, as you have done for only 5 cases:
methods_discussed = [['iud', 'male_condoms', 'pill'],
['male_condoms'],
[],
['female_sterilization', 'male_sterilization'],
['male_sterilization', 'iud', 'injectables']]
As I said, it has more than 5000 cases (data points); in other words, method_discussed can take any combination of items from the list above.
If you need it, I can send the CSV file with the expected outcome in Excel.
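To make the request concrete, here is a minimal sketch of the transformation described above. It rests on assumptions not confirmed in this thread: the column is named method_discussed exactly, the methods within one cell are separated by whitespace as in the example rows, and a blank cell means no method. The function name add_method_columns and the file name survey.csv are hypothetical.

```python
import pandas as pd

method_names = ['female_condoms', 'emergency', 'male_condoms', 'pill',
                'injectables', 'iud', 'male_sterilization',
                'female_sterilization']


def add_method_columns(csv_path, column='method_discussed'):
    """Add one 0/1 column per method to the CSV at csv_path and save it in place."""
    df = pd.read_csv(csv_path)
    # Blank cells are read as NaN; turn them into empty strings, then split
    # every cell on whitespace to get a list of method names per respondent.
    tokens = df[column].fillna('').astype(str).str.split()
    for name in method_names:
        # Whole-token comparison, so 'male_condoms' is not counted as present
        # just because the cell contains 'female_condoms'.
        df[name] = tokens.apply(lambda items, name=name: int(name in items))
    # Writing to the same path overwrites the original file.
    df.to_csv(csv_path, index=False)
    return df


# Usage (hypothetical file name):
# add_method_columns('survey.csv')
```

Because this overwrites the input file, it may be safer to keep a backup copy or write to a new file name first and compare the result with the expected outcome.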
Thanks
Ashish