If you need to group the dataset by continent, sum the population, and count the countries (stored in the index), you don't need to group by the index. You need only one grouping (by continent) but two aggregations: sum and count. And if your index (countries) contains only unique values, counting countries is the same as counting any column in the dataframe (as long as that column has no missing values, since count() skips them).
You can do it in two steps, computing the sum and the count separately and then merging the results, or in one pass with the .agg() function.
Spoiler examples:
Example dataframe:
In [1]: import pandas as pd
In [2]: data = {"continent":["Europe", "Europe", "North America"], "pop":[12313, 2341, 43312]}
In [3]: df = pd.DataFrame(data, index=["Germany", "France", "Canada"])
In [4]: df
Out[4]:
             continent    pop
Germany         Europe  12313
France          Europe   2341
Canada   North America  43312
Using .agg():
With .agg() you can pass a dictionary that defines which functions to apply to which columns. Only one column is used here:
In [5]: df.groupby("continent").agg({"pop":{"country_count":"count", "pop_sum":"sum"}})
Out[5]:
                        pop
              country_count pop_sum
continent
Europe                    2   14654
North America             1   43312
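The resulting dataframe has a column MultiIndex based on the dictionary that defines the aggregation. Note that this nested-dictionary (renaming) form was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError, so the call above works only on older versions.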
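On pandas 0.25 or later, the same sum and count can be requested with named aggregation, which also yields flat column names. This is a minimal sketch, assuming the example dataframe from above and a pandas version that supports the keyword form; the variable name summary is only for illustration:

```python
import pandas as pd

# Same example dataframe as above: countries in the index.
data = {"continent": ["Europe", "Europe", "North America"], "pop": [12313, 2341, 43312]}
df = pd.DataFrame(data, index=["Germany", "France", "Canada"])

# Named aggregation (assumes pandas >= 0.25):
# each keyword is an output column, each value is (input column, function).
summary = df.groupby("continent").agg(
    country_count=("pop", "count"),
    pop_sum=("pop", "sum"),
)
print(summary)
```

The result is indexed by continent and has two plain columns, country_count and pop_sum, rather than a column MultiIndex.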
"Simpler" approach with separate steps:
In [6]: country_count = df.groupby("continent").count()
In [7]: country_count
Out[7]:
               pop
continent
Europe           2
North America    1
In [8]: pop_sum = df.groupby("continent").sum()
In [9]: pop_sum
Out[9]:
                 pop
continent
Europe         14654
North America  43312
In [10]: country_count.columns=["country_count"]
In [11]: country_count.join(pop_sum)
Out[11]:
               country_count    pop
continent
Europe                     2  14654
North America              1  43312
You need to rename the column to avoid a name clash (or pass suffixes through the lsuffix / rsuffix parameters of .join()).
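The suffix alternative could look roughly like this. It is a sketch only: the suffix strings "_count" and "_sum" and the name combined are arbitrary choices for illustration, not anything the question prescribes:

```python
import pandas as pd

data = {"continent": ["Europe", "Europe", "North America"], "pop": [12313, 2341, 43312]}
df = pd.DataFrame(data, index=["Germany", "France", "Canada"])

country_count = df.groupby("continent").count()
pop_sum = df.groupby("continent").sum()

# Both frames have a "pop" column; the suffixes disambiguate them
# instead of renaming a column by hand.
combined = country_count.join(pop_sum, lsuffix="_count", rsuffix="_sum")
print(combined)
```

The joined frame then carries the columns pop_count and pop_sum. Without either a rename or suffixes, .join() raises an error because the column names overlap.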
If your index is not unique, probably the simplest solution is to add the index as another column (country) to the dataframe and use nunique() on the countries instead of count().
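A sketch of that non-unique case follows. The data is made up for illustration (Germany appears twice with an invented second population figure), and the named-aggregation form assumes pandas 0.25 or later:

```python
import pandas as pd

# Hypothetical data: "Germany" occurs twice in the index.
data = {
    "continent": ["Europe", "Europe", "Europe", "North America"],
    "pop": [12313, 500, 2341, 43312],
}
df = pd.DataFrame(data, index=["Germany", "Germany", "France", "Canada"])

# Turn the index into an ordinary column named "country".
df_with_country = df.rename_axis("country").reset_index()

# nunique() counts distinct countries; sum() still adds every row.
summary = df_with_country.groupby("continent").agg(
    country_count=("country", "nunique"),
    pop_sum=("pop", "sum"),
)
print(summary)
```

Here a plain count() would report three rows for Europe, while nunique() reports two distinct countries.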
And while .agg() is not a very well-known function, 10 Minutes to pandas contains more than enough information to work out the approach of summing and counting separately and then merging the results.