Python Forum
matplotlib Plotting smooth line with nans
#1
Hello,

I want to automate a chart (we call them snake charts; screenshot below) that we have so far been building in Excel.
It's a scatter plot with smoothed lines between the points, where the y-values are just rankings (1, 2, 3, 4, ...) that determine the order of the attributes.
It turns out that snake charts are not conventional (apparently we made this thing up?), and I can't figure out how to smooth the line given that the list contains NaNs.
My chart is ready except for the lines between the dots: those should be smoothed wherever possible (that is, wherever they connect more than two dots).

I've read about a lot of options for smoothing lines, including the suggestion that I should "mask" the NaNs,
here: https://stackoverflow.com/questions/5283...ith-pyplot
and here: https://www.adamsmith.haus/python/answer...-in-python
and here: https://www.geeksforgeeks.org/how-to-plo...atplotlib/
and here: https://matplotlib.org/devdocs/gallery/l..._demo.html

...but none of those options seems to solve my issue. Does anyone know how I can do this?
Note: in practice I can have 5 values, then a NaN, and then 5 more values, so I can't simply skip the first NaN.

import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import pandas as pd
import numpy as np 

data = pd.DataFrame({"brand": ["a", "a", "a", "a", "b", "b", "b", "b"],
                     "attribute": ["attr1", "attr2", "attr3", "attr4", "attr1", "attr2", "attr3", "attr4"],
                     "score": [np.nan, 0.55, 0.25, 0.15, 0.26, 0.45, 0.20, 0.15],
                     "order": [1, 2, 3, 4, 1, 2, 3, 4]})

colours= pd.DataFrame({"brand": ["a", "b"], "hex_color": ["#859F84", "#F57921"]})

element_column = "brand"
elements = data[element_column].unique().tolist()

for element in elements:
    # Scores go on the x-axis and the ranking ("order") on the y-axis.
    x = data.loc[data[element_column] == element, "score"]
    y = data.loc[data[element_column] == element, "order"]
    colour = colours.loc[colours[element_column] == element, "hex_color"].item()
    # The NaNs are in the scores, so mask x rather than y.
    x = np.ma.masked_invalid(x.to_numpy())
    plt.scatter(x, y, c=colour)
    plt.plot(x, y, c=colour)

labels = data[['attribute', 'order']].drop_duplicates().copy()
plt.yticks(labels["order"], labels["attribute"])
plt.show()
What I want:
[Image: 275566857_4846251875423126_7378958004873...e=623032F0]
#2
Can you get what you want if there are no NaNs? That seems like a really small number of points for smoothing to work.
#3
Hello deanhystad,

Yes, that is actually very simple with a small tweak (see the code below).
(source: https://www.geeksforgeeks.org/how-to-plo...atplotlib/)

Code result (with no NaNs; what I actually want is for "attr1" to have no data point and no line for brand a):
[Image: 275603798_4848250668556580_9305822478926...e=6230BAFF]

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np 
from scipy.interpolate import make_interp_spline


data = pd.DataFrame({"brand": ["a", "a", "a", "a", "b", "b", "b", "b"],
                     "attribute": ["attr1", "attr2", "attr3", "attr4", "attr1", "attr2", "attr3", "attr4"],
                     "score": [0.20, 0.55, 0.25, 0.15, 0.26, 0.45, 0.20, 0.15],
                     "order": [1, 2, 3, 4, 1, 2, 3, 4]})
 
colours= pd.DataFrame({"brand": ["a", "b"], "hex_color": ["#859F84", "#F57921"]})
 
element_column = "brand"
elements = data[element_column].unique().tolist()

# make_interp_spline needs increasing input values, so sort by order first
data = data.sort_values(by=['order'])

for element in elements:
    x = data.loc[data[element_column] == element, "score"]
    y = data.loc[data[element_column] == element, "order"]
    colour = colours.loc[colours[element_column] == element, "hex_color"].item()   

    X_Y_Spline = make_interp_spline(y, x)
    Y_ = np.linspace(y.min(), y.max(), 500)
    X_ = X_Y_Spline(Y_)
    plt.plot(X_, Y_, c=colour)
    plt.scatter(x, y, c=colour)
 
labels = data[['attribute', 'order']].drop_duplicates().copy()
plt.yticks(labels["order"], labels["attribute"])
plt.show()
#4
You are ok with that? That is exactly what I thought would happen, and I think it is unacceptable. The range of the smooth line is much greater than the range of the points.
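The overshoot can be checked numerically. The sketch below is only an illustration using the brand "a" scores from the code above: it compares the range of the `make_interp_spline` curve with the range of the data. As a possible alternative (not something used in this thread), it also evaluates `scipy.interpolate.PchipInterpolator`, a shape-preserving interpolator whose curve stays between neighbouring data values. The variable names are made up for the example.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator, make_interp_spline

# Brand "a" from the code above: ranking on one axis, score on the other.
order = np.array([1, 2, 3, 4])
score = np.array([0.20, 0.55, 0.25, 0.15])

# Evaluate both interpolators on the same dense grid of ranking values.
grid = np.linspace(order.min(), order.max(), 500)
spline_curve = make_interp_spline(order, score)(grid)   # default cubic spline
pchip_curve = PchipInterpolator(order, score)(grid)     # shape-preserving alternative

print("data range  :", score.min(), score.max())
print("spline range:", spline_curve.min(), spline_curve.max())
print("pchip range :", pchip_curve.min(), pchip_curve.max())
```

With so few points, the cubic spline leaves the range of the scores at both ends, while the PCHIP curve does not. Whether that trade-off (less "snake-like" curves, no overshoot) is acceptable is a design choice for the chart.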

If that is ok, you'll likely be fine replacing each NaN with a value interpolated from the surrounding points: b = (a + c) / 2.
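A minimal sketch of that fill, assuming the scores are in a pandas Series sorted by ranking and that the rankings are evenly spaced (so linear interpolation gives exactly the midpoint). The example values are made up; `limit_area="inside"` restricts the fill to NaNs that have valid values on both sides.

```python
import numpy as np
import pandas as pd

# Hypothetical scores with one NaN in the middle and one at the start.
inner_gap = pd.Series([0.26, np.nan, 0.20, 0.15])
leading_gap = pd.Series([np.nan, 0.55, 0.25, 0.15])

# Linear interpolation fills an interior NaN with (a + c) / 2.
filled_inner = inner_gap.interpolate(method="linear", limit_area="inside")

# A leading NaN has no left neighbour, so it is left untouched.
filled_leading = leading_gap.interpolate(method="linear", limit_area="inside")

print(filled_inner.tolist())
print(filled_leading.tolist())
```

Note that the filled value makes the line run straight through the missing attribute, so this only fits if a visible gap at that attribute is not required.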
#5
Hm... Not sure I follow what you mean.
What I want is exactly the same graph as in my original post.
Let's assume that "attr2" was missing from my original data. Then I expect:
* a single data point at "attr1" (but no line here)
* nothing at "attr2"
* a straight line going from "attr3" to "attr4"

Just to be clear: my previous post was an answer to "can you get what you want if there are no NaNs?" The answer is yes: the chart I showed in that reply is what I wanted.
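One way to get that behaviour might be to split each brand's data into runs of consecutive non-NaN scores and smooth every run separately. The sketch below is an assumption about how this could be done, not a tested solution from this thread: `smooth_segments` is a hypothetical helper, it assumes the data is already sorted by ranking, and it lowers the spline degree for short runs (2 points give a straight line, 3 a quadratic, 4 or more a cubic).

```python
import numpy as np
from scipy.interpolate import make_interp_spline


def smooth_segments(order, score, points_per_segment=100):
    """Yield (score_curve, order_curve) for each run of consecutive non-NaN scores.

    Hypothetical helper; `order` must be sorted in increasing order.
    """
    order = np.asarray(order, dtype=float)
    score = np.asarray(score, dtype=float)
    valid = ~np.isnan(score)
    # Positions where the valid/NaN state flips mark the run boundaries.
    boundaries = np.flatnonzero(np.diff(valid.astype(int))) + 1
    for idx in np.split(np.arange(len(score)), boundaries):
        if idx.size < 2 or not valid[idx[0]]:
            continue  # NaN run, or a lone point that only gets a scatter marker
        degree = min(3, idx.size - 1)  # the spline degree must be below the point count
        spline = make_interp_spline(order[idx], score[idx], k=degree)
        grid = np.linspace(order[idx[0]], order[idx[-1]], points_per_segment)
        yield spline(grid), grid


# The case described above: "attr2" is missing.
order = [1, 2, 3, 4]
score = [0.20, np.nan, 0.25, 0.15]
segments = list(smooth_segments(order, score))
print(len(segments))  # number of line pieces to draw
```

Inside the plotting loop, each yielded pair could be passed to `plt.plot(score_curve, order_curve, c=colour)`, while `plt.scatter(x, y, c=colour)` still draws every available point, including isolated ones.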