Posts: 8
Threads: 2
Joined: Mar 2019
Hello there,
I'm fairly new to Python and sometimes lack some of the basics.
I want to regroup some modalities (categories), like these:
For the column langue_navigateur:
Français 88262
NA 9723
Anglais 1607
Turc 171
Portugais 158
Espagnol 88
Russe 86
Italien 60
Roumain 47
Polonais 47
Allemand 30
Chinois 27
Arabe 23
I want to obtain an output like this:
Français 88262
NA 9723
Others: sum of the other modalities (1607 + 171 + ... + 23)
I think I need an if/else statement, but I couldn't find the right code to do this.
Could you help me? :)
Thanks.
Posts: 12,031
Threads: 485
Joined: Sep 2016
Show what you have tried. We are glad to help, but will not write the code for you.
Posts: 8
Threads: 2
Joined: Mar 2019
At first I wrote this kind of code:
Quote:transfosupport = {
"Ordinateur" : "Ordinateur",
"Smartphone" : "Smartphone",
"NA" : "NA",
"Tablette" : "Tablette Console et TV",
"Console" : "Tablette Console et TV",
"TV connect" : "Tablette Console et TV",
}
And
Quote:df['Supports'] = df['Supports'].map(transfosupport)
But this is too tedious with more than five modalities.
So I tried this:
Quote:if df['Continents'] == "Français" : "Français"
if df['Continents'] == "NA" : "NA"
else : "Others"
But it is obviously missing some details.
Posts: 2,126
Threads: 11
Joined: May 2017
text = """Français 88262
NA 9723
Anglais 1607
Turc 171
Portugais 158
Espagnol 88
Russe 86
Italien 60
Roumain 47
Polonais 47
Allemand 30
Chinois 27
Arabe 23"""
This is your starting point (later you should get the input from a file or somewhere else).
Assign 0 to a result variable before the loop.
Iterate in a loop over the lines with text.splitlines() .
Then split each line in the loop, take the second element.
The second element is still a str, so you need to convert it to an integer.
Add the integer to the result.
Once you have done this manually, try it with sum.
The second column can then be summed in a single line.
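The steps above could be sketched roughly as follows. This is only one possible reading: the KEEP set and the variable names are assumptions added here so that the total matches the "Others" row from the first post, rather than the sum of the whole column.

```python
text = """Français 88262
NA 9723
Anglais 1607
Turc 171
Portugais 158
Espagnol 88
Russe 86
Italien 60
Roumain 47
Polonais 47
Allemand 30
Chinois 27
Arabe 23"""

# Assumption: these two modalities are kept as-is, everything else is "Others".
KEEP = {"Français", "NA"}

# Manual version: start at 0, loop over the lines, split, convert, add.
others_total = 0
for line in text.splitlines():
    name, count = line.split()      # e.g. ["Anglais", "1607"]
    if name not in KEEP:
        others_total += int(count)  # count is still a str, so convert it

# Same thing with sum() and a generator expression.
others_total_sum = sum(
    int(count)
    for name, count in (line.split() for line in text.splitlines())
    if name not in KEEP
)

print(others_total)  # 2344
```

Dropping the `if name not in KEEP` condition gives the plain sum of the second column instead.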
Posts: 8
Threads: 2
Joined: Mar 2019
I'm sorry, I understand what you're saying, but I didn't quite understand how to do that.
*Assign 0 to a result variable before the loop.
Ok for this
Quote:j = 0
Iterate in a loop over the lines with text.splitlines().
Something like that ?
Quote:x.text.splitlines() for x in text
*Then split each line in the loop, take the second element.
Something like that ?
Quote:x.txt.split('\') for x in text
*The second element is still a str, so you need to convert it to an integer.
I need to use astype, but I don't know how to use it on this specific term.
*Add the integer to the result.
All right
Each step is quite clear to me, but put together, I'm confused :/
Posts: 2,342
Threads: 62
Joined: Sep 2016
Mar-22-2019, 11:54 PM
(This post was last modified: Mar-22-2019, 11:55 PM by micseydel.)
(Mar-22-2019, 12:38 PM)LoliMarth Wrote: *Assign 0 to a result variable before the loop. Ok for this Quote:j = 0
Yes.
(Mar-22-2019, 12:38 PM)LoliMarth Wrote: Iterate in a loop over the lines with text.splitlines(). Something like that ? Quote:x.text.splitlines() for x in text
No, more like
for line in text.splitlines():
(Mar-22-2019, 12:38 PM)LoliMarth Wrote: *Then split each line in the loop, take the second element. Something like that ? Quote:x.txt.split('\') for x in text
More like (inside the loop body)
x = line.split()[1]
(Mar-22-2019, 12:38 PM)LoliMarth Wrote: *The second element is still a str, so you need to convert it to an integer. I need to use astype but i don't know how to use onthis specific term.
I don't understand what you're saying here, but you can call int.
(Mar-22-2019, 12:38 PM)LoliMarth Wrote: *Add the integer to the result. All right Each thing is quite clear for me, but mixed together, I'm confused :/
I'm going to hope that my help so far will get you close enough to do this last part yourself.
Posts: 817
Threads: 1
Joined: Mar 2018
As can be seen from the context (the df variable), you are using Pandas.
So, you can create a Pandas dataframe from the string:
from io import StringIO
import pandas as pd
# text variable defined here
data = pd.read_table(StringIO(text), sep=r'\s+', header=None)
Further, you can access the first column, e.g. data.iloc[:, 0], apply a mapping to it, or do anything else you want.
data.iloc[2:, -1].sum() returns the sum you are trying to find.
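For a full three-row summary (Français, NA, Others) rather than just the one sum, something along these lines might work. It is a sketch, not a definitive approach: the column names "langue" and "count" and the keep list are assumptions made for illustration. One detail worth knowing is that pandas reads the literal text "NA" as a missing value by default, so keep_default_na=False is passed to keep it as a label.

```python
from io import StringIO

import pandas as pd

text = """Français 88262
NA 9723
Anglais 1607
Turc 171
Portugais 158
Espagnol 88
Russe 86
Italien 60
Roumain 47
Polonais 47
Allemand 30
Chinois 27
Arabe 23"""

# keep_default_na=False stops pandas from turning the label "NA" into NaN.
# The column names are hypothetical, chosen only for this example.
data = pd.read_csv(
    StringIO(text),
    sep=r"\s+",
    header=None,
    names=["langue", "count"],
    keep_default_na=False,
)

keep = ["Français", "NA"]  # assumption: modalities to leave untouched

# Keep the label where it is in `keep`, otherwise replace it with "Others".
grouped_label = data["langue"].where(data["langue"].isin(keep), "Others")

# Sum the counts per regrouped label, preserving the original order.
totals = data.groupby(grouped_label, sort=False)["count"].sum()
print(totals)
```

Here totals is a Series indexed by Français, NA and Others, so totals["Others"] gives the same number as data.iloc[2:, -1].sum().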
Posts: 8
Threads: 2
Joined: Mar 2019
Mar-25-2019, 12:28 PM
(This post was last modified: Mar-25-2019, 12:28 PM by LoliMarth.)
Yeah, that works, this is great! :)
I understand how it works, it's a nice trick.
Thanks!
But it doesn't create another variable with the new modalities, though.
My original goal was to obtain an output like this:
Variable y :
français
Français
NA
NA
français
Others
Français
NA
Others
And so on
I just realised that my request wasn't right in my first post, sorry :/
I don't want the count; I just want to keep "Français" as "Français" and "NA" as "NA", and replace all the other modalities with "Others" in my variable y.
Sorry for the misunderstanding :/
Posts: 8
Threads: 2
Joined: Mar 2019
Mar-25-2019, 03:18 PM
(This post was last modified: Mar-25-2019, 03:19 PM by LoliMarth.)
I hadn't thought of this solution, but it works!
I did this:
Quote:df.loc[:,"NvxY"] = "Others"
df.loc[(df["Y"] == "Français") ,"NvxY"] = "Français"
df.loc[(df["Y"] == "NA") ,"NvxY"] = "NA"
Then I drop Y and obtain a variable NvxY with the modalities I wanted! It's fast and not tedious at all!
Do you perhaps know a more efficient way?
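One possibly more compact alternative is Series.where combined with isin, which does the replacement in a single assignment. The sketch below uses a small made-up dataframe and assumes that "NA" is stored as a literal string in column Y; if it is a real missing value (NaN), it would need to be filled first, for example with fillna("NA").

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame(
    {"Y": ["Français", "NA", "Anglais", "Turc", "Français", "NA", "Arabe"]}
)

keep = ["Français", "NA"]  # modalities to leave untouched

# where() keeps the value when the condition is True and
# substitutes "Others" everywhere else.
df["NvxY"] = df["Y"].where(df["Y"].isin(keep), "Others")

print(df)
```

Adding another modality to keep only means extending the keep list, so there is no extra df.loc line per category.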