Python Forum

Full Version: Matplot / numpy noisy data problem
Hi there,
I seem to be having a weird issue in Python when using numpy, pandas and matplotlib. I have basic time series data loaded from a CSV, but when I plot it, the chart does not look like a standard time series line chart; instead it has tons of noise. The data is just the closing price of the S&P, but the chart doesn't look like it should and it's making my analysis stall. See the attachments for some snippets of the issue: the exported plot, the way it looks in Excel (how it should look), and what the raw CSV data looks like. Any help would be appreciated!

[attachment=2247]
[attachment=2248]
[attachment=2249]
Post the code for reading the csv file. Try plotting 1 month of data. Maybe the plot will reveal a problem, or demonstrate that there is not a problem.

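I think the problem is that one of your columns is being read as a date, but parsed with the wrong format. If day and month get swapped for some rows, the dates are no longer in chronological order, so the line jumps back and forth and looks like noise. I can demonstrate this by using pandas to write a CSV file and then read it back with the wrong date format.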
import pandas as pd
import matplotlib.pyplot as plt

# Make a dataframe
dates = pd.date_range(start='1/1/2000', end='12/31/2022')
df = pd.DataFrame({"date": dates, "value": list(range(len(dates)))})

# Write to csv file using day first format
df.to_csv('data.csv', index=False, date_format='%d/%m/%Y')

# Read from csv without using day first
df2 = pd.read_csv('data.csv', parse_dates=["date"])

# Read from csv using day first
df3 = pd.read_csv('data.csv', parse_dates=["date"], dayfirst=True)

df.plot(x='date', y='value', title="Original Dataframe")
df2.plot(x='date', y='value', title='read_csv(dayfirst=False)')
df3.plot(x='date', y='value', title='read_csv(dayfirst=True)')
plt.show()
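
If the date format does turn out to be the cause, one possible fix is to pass the format explicitly instead of letting pandas guess, then sort by date before plotting. This is only a sketch: the column names "date" and "close", the DD/MM/YYYY format and the sample rows are assumptions, since the actual CSV was not posted, and load_prices is a made-up helper name.

```python
import io
import pandas as pd

# Hypothetical sample rows in day-first format (DD/MM/YYYY), deliberately out of order.
csv_text = "date,close\n13/02/2022,101\n01/02/2022,100\n01/03/2022,102\n"

def load_prices(source, date_col="date", date_format="%d/%m/%Y"):
    """Read a CSV, parse the date column with an explicit format, and sort by date."""
    df = pd.read_csv(source)
    # An explicit format raises an error on a mismatch instead of silently guessing.
    df[date_col] = pd.to_datetime(df[date_col], format=date_format)
    return df.sort_values(date_col).reset_index(drop=True)

df = load_prices(io.StringIO(csv_text))

# Quick checks before plotting: the column should be a datetime type
# and the dates should only ever increase.
print(df.dtypes)
print(df["date"].is_monotonic_increasing)
```

If the second check prints False on the real data before sorting, the rows are out of chronological order, which would explain a line chart that zigzags.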