Python Forum

Plotting histogram of dataframe column
I have the answer, but I don't understand it:

from matplotlib import pyplot as plt  # the next line is a Jupyter notebook magic
%matplotlib inline
plt.style.use('seaborn')  # named 'seaborn-v0_8' in Matplotlib 3.6 and later

# Plot a histogram of combined points
plt.hist(super_bowls.combined_pts)
plt.xlabel('Combined Points')
plt.ylabel('Number of Super Bowls')
plt.show()

# Display the highest- and lowest-scoring Super Bowls
display(super_bowls[super_bowls['combined_pts'] > 70])
Please compare lines 6 and 12. Why does the syntax for plt.hist include a dot between the name of the dataframe (super_bowls) and the column in that df (combined_pts)?

Line 12, in contrast, uses two sets of brackets to look up (I think) the rows in the dataframe for which the combined_pts column value is greater than 70.

Thanks!

I just learned that a DataFrame's columns can be accessed as attributes. I can print(super_bowls.combined_pts) to get a 2-column listing of row number and combined points.

This helps me understand line 6.
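
For anyone following along, here is a minimal sketch of what that dot access returns. The dataframe below is a made-up stand-in (the real super_bowls data is not shown in this thread), so the numbers and the super_bowl column are assumptions for illustration only. The "2-column list" is a pandas Series: the left side is the index (row labels) and the right side holds the values.

```python
import pandas as pd

# Hypothetical stand-in for super_bowls; the values are invented for illustration.
super_bowls = pd.DataFrame({
    "super_bowl": [1, 2, 3, 4],
    "combined_pts": [45, 47, 23, 75],
})

# Dot access and bracket access both return the same column as a Series.
by_dot = super_bowls.combined_pts
by_bracket = super_bowls["combined_pts"]

print(type(by_dot).__name__)       # the column is a Series, not a list
print(by_dot.equals(by_bracket))   # same data either way
print(by_dot)                      # index on the left, values on the right
```

So plt.hist(super_bowls.combined_pts) is simply handed that Series of values to bin.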

Why does Line 12 require the double brackets, though?
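The dataframe understands boolean indexing. The inner expression super_bowls['combined_pts'] > 70 gives a True/False value for every row, and the outer brackets keep only the rows marked True. So in English, line 12 would be: display super_bowls where super_bowls combined points is greater than 70.
(Jul-29-2020, 04:06 PM)jefsummers Wrote: The dataframe understands boolean indexing. The inner expression super_bowls['combined_pts'] > 70 gives a True/False value for every row, and the outer brackets keep only the rows marked True. So in English, line 12 would be: display super_bowls where super_bowls combined points is greater than 70.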

Why couldn't I have used a dot there, since columns can be accessed as attributes of the DataFrame?
You can.
display(super_bowls[super_bowls.combined_pts > 70])
should work
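
A rough sketch of why both spellings give the same result, using an invented stand-in for super_bowls (the data, the "winning team" column, and the "count" column are assumptions made up for this example). The comparison produces a boolean Series, and the outer brackets keep the rows where it is True. The sketch also shows two cases where, as far as I know, only the bracket form reaches the column: a name containing a space, and a name that clashes with an existing DataFrame method.

```python
import pandas as pd

# Hypothetical stand-in for super_bowls; all values are invented.
super_bowls = pd.DataFrame({
    "super_bowl": [1, 2, 3, 4],
    "combined_pts": [45, 71, 23, 75],
    "winning team": ["A", "B", "C", "D"],  # name contains a space
    "count": [10, 20, 30, 40],             # name clashes with DataFrame.count
})

# Step 1: the comparison yields one True/False per row.
mask = super_bowls["combined_pts"] > 70
print(mask.tolist())

# Step 2: indexing the dataframe with that mask keeps only the True rows.
high_bracket = super_bowls[mask]
high_dot = super_bowls[super_bowls.combined_pts > 70]
print(high_dot.equals(high_bracket))

# Bracket access is the only option for a column name with a space in it.
teams = super_bowls["winning team"]

# Dot access finds the DataFrame method first, not the column named "count".
print(callable(super_bowls.count))
counts = super_bowls["count"]
```

Because of those two cases, the bracket form is the safer default, and the dot form is a convenience for simple column names.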
(Jul-29-2020, 09:01 PM)jefsummers Wrote: You can.
display(super_bowls[super_bowls.combined_pts > 70])
should work

Seriously?? I'm so confused... I'll have to study this.

Thanks Jef!