Python Forum

Full Version: ValueError: x and y must have same first dimension, but have shapes (11,) and (15406,
Hello, any idea how I can fix this error?

Error:
ValueError: x and y must have same first dimension, but have shapes (11,) and (15406, 1)
I am trying to plot... I also tried converting the x and y values with np.array, without any success.
You have given very little information, so all I can offer is:
https://numpy.org/doc/stable/reference/g...shape.html

Reshaping may or may not solve your problem, depending on what you actually want to plot.
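To make the shape mismatch concrete, here is a small sketch with made-up arrays that have the same shapes as in the error message. The names x, y and y_flat are placeholders, not the original poster's variables. Flattening the (15406, 1) array removes the extra dimension but does not change the number of points, so x would still need 15406 entries.

```python
import numpy as np

# Placeholder data with the shapes from the error message
x = np.arange(11)           # shape (11,)
y = np.zeros((15406, 1))    # shape (15406, 1)

# ravel() drops the trailing dimension of size 1 ...
y_flat = y.ravel()          # shape (15406,)

# ... but the lengths still differ, so plotting x against y_flat would still fail
same_length = x.shape[0] == y_flat.shape[0]
print(x.shape, y.shape, y_flat.shape, same_length)
```

So reshaping alone only helps when the two arrays already hold the same number of points.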
OK, so I have an Excel file with 15000 measurements (1st column: timestamps, 2nd column: values). I want to plot them with the measurements on the y-axis and only 20-25 values in total on the x-axis, since I obviously cannot show 15000 xticks. I am following this code:

import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

time = mdates.drange(dt.datetime(2014, 12, 20), dt.datetime(2015, 1, 2),
                     dt.timedelta(hours=2))
y = np.random.normal(0, 1, time.size).cumsum()
y -= y.min()

fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(time, y, 'bo-')
ax.set(title='Active Calls', ylabel='Calls', xlabel='Time')
ax.grid()

ax.xaxis_date() # Default date formatter
fig.autofmt_xdate()

plt.show()
I have been trying to modify this code to make it work, but I get errors like the one above. Can you help me?
First of all - do you really want 15000 points on that plot? Do you really have 15000 distinct dates?
You can explicitly set the xticks - look at https://stackoverflow.com/q/12608788/4046632
Here is an example, as far as I understand your goal:

import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import random

x = mdates.drange(dt.datetime(2021, 3, 1, 0, 0, 0), dt.datetime(2021, 3, 3, 0, 0, 0),
                     dt.timedelta(hours=1))
y = [random.randint(0,100) for _ in range(len(x))]
time = mdates.drange(dt.datetime(2021, 3, 1, 0, 0, 0), dt.datetime(2021, 3, 3, 1, 0, 0),
                     dt.timedelta(hours=6))
 
fig, ax = plt.subplots(figsize=(8, 6))
plt.xticks(time)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d %H:%M"))
# ax.xaxis.set_minor_formatter(mdates.DateFormatter("%m-%d %H:%M"))
ax.plot(x, y, 'bo-')
ax.set(title='Active Calls', ylabel='Calls', xlabel='Time')
ax.grid()
 
ax.xaxis_date() # Default date formatter
fig.autofmt_xdate()
 
plt.show()
So, I tried a mix of the code you posted above + stackoverflow code from your link. This is what I am trying:

time = mdates.drange(dt.datetime(2021, 3, 1, 0, 0, 0), dt.datetime(2021, 3, 3, 1, 0, 0), dt.timedelta(hours=6))
y = trainPredictPlot
fig, ax = plt.subplots()
ax.plot(time,y)
start, end = ax.get_xlim()
ax.xaxis.set_ticks(np.arange(start, end, 10))
ax.xaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))
plt.show()
I got a similar error:
Error:
ValueError: x and y must have same first dimension, but have shapes (9,) and (15406, 1)
The problem starts at line 4: nothing from line 4 onward runs.
Again, you are making the same mistake. In ax.plot, the first argument (the x values) must be the same size as the second one (the y values). The x values must be the timestamps from your file; time should be used only for the xticks.
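A minimal sketch of that idea, with synthetic data standing in for the Excel file. The start date, the one-minute spacing, the random values and the choice of 20 ticks are all assumptions for illustration. x holds one timestamp per measurement, and a much shorter array is used only for the tick positions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

n = 15406
# Assumed timestamps: one per measurement, one minute apart
x = np.datetime64("2021-03-01T00:00") + np.arange(n) * np.timedelta64(1, "m")
# Random values shaped (n, 1) like the array in the error, flattened for plotting
rng = np.random.default_rng(0)
y = rng.normal(0, 1, (n, 1)).cumsum(axis=0).ravel()

# 20 positions spread evenly over the data; used for the ticks only
tick_idx = np.linspace(0, n - 1, 20).astype(int)
ticks = x[tick_idx]

fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(x, y)                            # x and y both have n entries
ax.set_xticks(mdates.date2num(ticks))    # tick positions as matplotlib date numbers
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d %H:%M"))
fig.autofmt_xdate()
```

Call plt.show() (with an interactive backend) or fig.savefig(...) to see the result.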
So, I should somehow transform the excel column of the timestamps to an array? np.array?
Whether it is an np.array is up to you. It can also be a pandas DataFrame or a simple list,
e.g.


import datetime as dt
import pandas as pd 
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import random

days = mdates.HourLocator(byhour=(0, 12))   # every 12 hours
hours = mdates.HourLocator()

df = pd.read_excel('data.xlsx')  # assumes the sheet has columns named 'x' and 'y'
 
fig, ax = plt.subplots(figsize=(8, 6))
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d %H:%M"))
ax.xaxis.set_minor_locator(hours)
ax.plot(df.x, df.y, 'bo-')
ax.set(title='Active Calls', ylabel='Calls', xlabel='Time')
ax.grid()
 
ax.xaxis_date() # Default date formatter
fig.autofmt_xdate()
 
plt.show()
[attachment=1056]


import datetime as dt
import pandas as pd 
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import random

df = pd.read_excel('data.xlsx')  # assumes the sheet has columns named 'x' and 'y'
time = mdates.drange(dt.datetime(2021, 3, 1, 0, 0, 0), dt.datetime(2021, 3, 3, 1, 0, 0),
                     dt.timedelta(hours=6))
 
fig, ax = plt.subplots(figsize=(8, 6))
plt.xticks(time)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d %H:%M"))
ax.plot(df.x, df.y, 'bo-')
ax.set(title='Active Calls', ylabel='Calls', xlabel='Time')
ax.grid()
 
ax.xaxis_date() # Default date formatter
fig.autofmt_xdate()
 
plt.show()
[attachment=1057]
I ran the first of the two pieces of code and got this error. Instead of df I use dataframe.

Error:
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'x'
To be more specific, it does not run from line 16 onward. Does it have something to do with the Excel file? The file has 4 columns: the first holds the timestamps and the other 3 hold the measurements.

Any idea?
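One likely cause, offered as a guess since the file itself is not shown: dataframe.x only works if the sheet has a column literally named x. The sketch below uses an in-memory DataFrame as a stand-in for the four-column Excel file (the column names 'Timestamp', 'm1', 'm2' and 'm3' are invented) to show how to inspect the real names and select columns by position instead.

```python
import pandas as pd

# Stand-in for pd.read_excel(...); the column names here are invented
dataframe = pd.DataFrame({
    "Timestamp": pd.date_range("2021-03-01", periods=5, freq="D"),
    "m1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "m2": [5.0, 4.0, 3.0, 2.0, 1.0],
    "m3": [0.0, 1.0, 0.0, 1.0, 0.0],
})

print(list(dataframe.columns))      # see what the columns are actually called
has_x = "x" in dataframe.columns    # no column named 'x', so dataframe.x would fail

# Select by position: first column as x, second column as y
x = dataframe.iloc[:, 0]
y = dataframe.iloc[:, 1]
```

With x and y selected this way, ax.plot(x, y, 'bo-') receives two columns of the same length regardless of what the headers are called.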