Python Forum

Full Version: matplotlib and line chart with shaded area
Dear Python Experts,

I have to plot some data as a line chart but somehow it looks more like a scatter plot.

[Image: plot.jpg]

I am happy that I got this far, but I wonder why it is not rendered as 2 lines but rather as individual points.

Is it possible to make the area between the two lines slightly gray?

Any help is much appreciated.
  • it is rendered as markers connected with lines because you explicitly ask for it with "-o" - that means solid line and circular markers
  • to fill area between two curves you can use fill_between()
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 3, 300)
y1 = np.sin(x)
y2 = x/3

plt.plot(x, y1, '-.', x, y2, '--')
plt.fill_between(x, y1, y2, where=y1>=y2, facecolor='gold')
plt.fill_between(x, y1, y2, where=y2>=y1, facecolor='tan')
[Image: ovZsLjZ.png]
Hi zivoni,

Thanks for your reply. Amazing. You know everything :)

My TMAXx_ and TMINx_ look like the following

    Data_Value
0    44.893617
1    39.872340
2    32.723404

x is the day of the year: 0 should be the first day, and it goes up to 365.
Whenever I try to pass in the index of TMAXx_, I get

“ValueError: Argument dimensions are incompatible”
Check the shape of your arguments. Without the actual code it's hard to say what is wrong, but the shape of your x, y data might depend on how you select it:

Output:
In [30]: df = pd.DataFrame({'A':[11, 13, 16, 17], 'B':[16, 15, 18,18]})

In [31]: df.index.shape
Out[31]: (4,)

In [32]: df['A'].shape
Out[32]: (4,)

In [33]: df[['A']].shape
Out[33]: (4, 1)

In [34]: df.A.shape
Out[34]: (4,)
Check that you have same dimensions.

And it is not necessary to use index 0.. len(df)-1, you can use
plt.plot(range(len(df)), df.A)
And if you have custom labels (or index with actual values), xticks can be used
plt.xticks(range(len(df)), ['jan', 'feb', 'mar', 'apr'])
Hi zivoni,
Thanks for your reply.
Here is my code.


%matplotlib notebook
def linegraph():
    NOAA = data_summary()
    TMAX = NOAA[['Data_Value']].where(NOAA['Element'] =='TMAX')
    TMAXx = TMAX.dropna()
    TMAXx_=TMAXx.reset_index(drop=True)
    DAYS = TMAXx_.index.tolist()
    TMIN = NOAA[['Data_Value']].where(NOAA['Element'] =='TMIN')
    TMINx = TMIN.dropna()
    TMINx_= TMINx.reset_index(drop=True)
    plt.figure()
    # plot the TMAX and the TMIN data
    plt.plot(TMAXx_, '-', TMINx_, '-')  
    #plt.plot([22,44,55], '--r') ADD 2015
    
    
    time = np.linspace(0,1,100)
    y = np.sin(time*10)
    y1 = y - 0.5
    y2 = y + 0.5
    
    
    #plt.fill_between(time, TMAXx_, y2, color='grey', alpha='0.5')
    #plt.fill_between(x,TMAXx_, TMINx_, where=TMAXx_>=TMINx_, facecolor='gray')
    return TMAXx_
linegraph()
As far as I see TMAXx_ and TMINx_ are both dataframes of shape index,value e.g.

[Image: plot.png]

I don't understand what the first argument of fill_between is about:

plt.fill_between(x, y1, y2, where=y1>=y2, facecolor='gold')

My x-axis value is the index, i.e. the 365 days, and y is the temperature in the Data_Value column of TMAXx_ and TMINx_.
From your code it is clear that the shapes of the index and your TM... are different.

While index has a shape (n,) (one dimension), your TM... are dataframes with a shape (n,1) (two dimensional objects) and that is not allowed as a plt.fill_between argument. Select your variables as a Series, that should work.

fill_between() requires three arguments: x (the values on the x axis), and y1 and y2 (the "curves" to fill between). In general x, y1 and y2 should be arrays of the same length, although y1 or y2 can be a scalar (for example, you can fill the area between a curve and the x axis by setting one "y" to 0). The other parameters are optional. "where" limits which part is filled; the default is to fill everything in between, i.e. both the areas where y1 > y2 and where y2 > y1.
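A minimal sketch of the scalar and "where" cases described above. The sine curve and all variable names here are made up for illustration and are not from the original poster's data; the Agg backend is only chosen so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for this sketch only
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: one full period of a sine curve
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y, "-")

# y2 given as the scalar 0: shade between the curve and the x axis,
# restricted by `where` to the part where the curve is non-negative
above = ax.fill_between(x, y, 0, where=y >= 0, facecolor="gray", alpha=0.5)

# No `where`: everything between the two curves is filled
band = ax.fill_between(x, y - 0.5, y + 0.5, facecolor="tan", alpha=0.3)
```

Note that x, y and the where mask all have shape (200,), which is the condition fill_between needs.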

Using
df = df.where(df.column=='something')
df = df.dropna()
to remove rows that don't fulfill the condition looks a little weird. Why not just use your condition to select directly?
df = df[df.column=='something']
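To make the two suggestions above concrete, here is a small sketch that compares the where/dropna route with direct boolean selection, and picks the column as a Series at the same time. The "noaa" frame is a hypothetical stand-in for the result of data_summary(), which the thread does not show.

```python
import pandas as pd

# Hypothetical stand-in for the thread's NOAA table
noaa = pd.DataFrame({
    "Element": ["TMAX", "TMIN", "TMAX", "TMIN", "TMAX", "TMIN"],
    "Data_Value": [44.9, 10.1, 39.9, 8.2, 32.7, 5.5],
})

# Two-step version from the thread: mask the rows, then drop the NaNs.
# Selecting with [["Data_Value"]] keeps a DataFrame of shape (n, 1).
two_step = noaa[["Data_Value"]].where(noaa["Element"] == "TMAX").dropna()

# Direct boolean selection; a single column label gives a Series of shape (n,)
tmax = noaa.loc[noaa["Element"] == "TMAX", "Data_Value"].reset_index(drop=True)

print(two_step.shape)  # (3, 1)
print(tmax.shape)      # (3,)
```

Both versions keep the same three rows; only the second one is one-dimensional and can be passed straight to fill_between.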
Hi zivoni,
Many thanks for your reply.
I selected both as numpy arrays, but the error remains the same:

#T... as Array
TMAXSERIES= TMAXx_[TMAXx_.columns[0]].values
TMINSERIES= TMINx_[TMINx_.columns[0]].values

plt.fill_between(time,TMAXSERIES, TMINSERIES, where=TMAXSERIES>=TMINSERIES, facecolor='gray')

All three (time, TMAXSERIES and TMINSERIES) are numpy arrays.
It's hard to say what is wrong without seeing the actual data and error message. Are the shapes of your inputs identical, including time? The following code should return True:
time.shape == TMAXSERIES.shape == TMINSERIES.shape
Thank you. I got it. My time was of the wrong type.
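The final working code is not shown in the thread. As a sketch of one way to avoid the mismatch, x can be built from the length of the data instead of from an unrelated linspace. The temperature arrays below are made up and only stand in for TMAXSERIES and TMINSERIES.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for this sketch only
import matplotlib.pyplot as plt
import numpy as np

# Made-up daily temperatures standing in for TMINSERIES / TMAXSERIES
days = np.arange(365)                            # x: one value per day, 0..364
tmin = 5 + 10 * np.sin(2 * np.pi * days / 365)
tmax = tmin + 8

# The check suggested above: all three inputs must have the same shape
assert days.shape == tmax.shape == tmin.shape

fig, ax = plt.subplots()
ax.plot(days, tmax, "-", days, tmin, "-")
# alpha is passed as a number here, not as a string
shaded = ax.fill_between(days, tmin, tmax, facecolor="gray", alpha=0.5)
```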
FYI - Free for next 7 hours: 'Mastering matplotlib' packt publishing: https://www.packtpub.com/packt/offers/free-learning