Python Forum
Estimating standard deviation from DataSet
#1
Question 
I am having a hard time calculating the standard deviation of the data shown in the graph, and I was wondering what the steps are. Here is the code I have so far, but it is not producing the right output. The DataSet contains thousands of random numbers such as:
9.8254457980e-1 1.0293906530e+0 8.6314178340e-1 8.7754757930e-1 8.2216021950e-1 9.8155318390e-1 1.0215753050e+0 1.0064994180e+0 1.0300426240e+0 8.7195144970e-1 9.4140464140e-1 1.0811751280e+0 8.5982980390e-1

# Worksheet 1.3A - make plot of DataSet vs row nums
import numpy as np
import matplotlib.pyplot as plt

# Import data as a list of numbers
with open("DataSet1.dat", "r") as textFile:
    data = textFile.read().split()  # split on whitespace
data = [float(point) for point in data]  # convert strings to floats

# This is my code for calculating the square of the deviation:
# variance = mean of the squares minus the square of the mean
mean = np.average(data)
variance = np.average(np.square(data)) - mean**2
print("The square of the standard deviation is:", variance)
print("The standard deviation is:", np.sqrt(variance))
# ^Would square root of variance calculate the standard deviation?

plt.xlabel('x axis')
plt.ylabel('y axis')
plt.title('Plot of DataSet vs Row Numbers')
plt.plot(data)
plt.show()
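
On the question in the last code comment: yes, the standard deviation is the square root of the variance. The sketch below is only an illustration, assuming the population form (dividing by N). It checks that "mean of the squares minus the square of the mean" agrees with the definition "mean of the squared deviations". The function names are made up for this example, and `sample` holds just the thirteen values quoted in the post, not the full file.

```python
# The thirteen sample values quoted in the post; the real file has thousands.
sample = [
    9.8254457980e-1, 1.0293906530e+0, 8.6314178340e-1, 8.7754757930e-1,
    8.2216021950e-1, 9.8155318390e-1, 1.0215753050e+0, 1.0064994180e+0,
    1.0300426240e+0, 8.7195144970e-1, 9.4140464140e-1, 1.0811751280e+0,
    8.5982980390e-1,
]


def variance_shortcut(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(v * v for v in values) / n
    return mean_of_squares - mean ** 2


def variance_definition(values):
    """Mean of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


shortcut = variance_shortcut(sample)
definition = variance_definition(sample)
print("variance (shortcut):  ", shortcut)
print("variance (definition):", definition)
print("standard deviation:   ", definition ** 0.5)
```

The two variances should agree to within floating-point rounding. The shortcut form can lose precision when the mean is large compared with the scatter, so the definitional form is usually the safer one to hand in.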
#2
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.

Also do you have more information on the data set?
Is there an assignment sheet, or data specification?
The data doesn't really look random from the small sample that you are showing.
Most of the data points are between 8 e-1 and 9 e-1 with some 'noise' as low readings.
You use the term row in your code, so I am guessing that the data set is an array, correct?

I'm thinking you may need something like numpy.std: https://docs.scipy.org/doc/numpy-1.13.0/...y.std.html
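
A short sketch of what a `numpy.std` call might look like, assuming numpy is installed. The array below is just the first five values from the original post, used for illustration. By default `numpy.std` divides by N (`ddof=0`); passing `ddof=1` divides by N - 1 instead, which is the usual choice when the data is a sample of a larger population.

```python
import numpy as np

# First five values quoted in the original post (illustrative only).
sample = np.array([
    9.8254457980e-1, 1.0293906530e+0, 8.6314178340e-1,
    8.7754757930e-1, 8.2216021950e-1,
])

population_std = np.std(sample)       # divides by N (ddof=0 is the default)
sample_std = np.std(sample, ddof=1)   # divides by N - 1

print("population standard deviation:", population_std)
print("sample standard deviation:    ", sample_std)
```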
#3
Here is the code; I apologize for the delay. Yes, the data is mostly noise. The problem asks:

Make a plot of your data in DataSet1 vs. row number. From your plot, estimate the standard deviation of your data, i.e. how far the points scatter from the average, and record your value. Now write a program to calculate the standard deviation (don't use built-in functions this time). Discuss how well your estimate matches your calculated value.

Every time we make a measurement, a value of the noise gets determined at random. This is a bit like quantum mechanics, where when we make a measurement of, say, position, the value is determined randomly. In quantum mechanics the probability of measuring different positions is drawn from a probability distribution given by the complex square of the wave function, P(x) = |Ψ|^2, so that the probability of measuring a position between x1 and x2 is given by the area under the P(x) curve between x1 and x2.
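
For reference, the quantities the assignment mentions can be written out as below. This is a sketch under an assumption: it uses the population form of the standard deviation (dividing by N), since the assignment text does not say whether N or N - 1 is intended. The symbols x_i, \bar{x}, N and \sigma are notation introduced here, not taken from the worksheet.

```latex
\begin{align}
  \bar{x} &= \frac{1}{N}\sum_{i=1}^{N} x_i \\
  \sigma  &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2} \\
  P(x_1 \le x \le x_2) &= \int_{x_1}^{x_2} \left|\Psi(x)\right|^2 \, dx
\end{align}
```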

# Worksheet 1.3A - make plot of DataSet vs row nums
import matplotlib.pyplot as plt

# Read every whitespace-separated value in the file as a float
with open('DataSet1.dat', 'r') as textFile:
    dataSet = [float(i) for i in textFile.read().split()]

# Add up the values by hand, then divide by the number of points
summ = 0.0
for value in dataSet:
    summ += value
average = summ / len(dataSet)
print("The average is:", average)

plt.xlabel('x axis')
plt.ylabel('y axis')
plt.title('Plot of DataSet vs Row Numbers')
plt.plot(dataSet)
plt.show()
#4
Then, since you can't use numpy, you need the algorithm for computing the standard deviation by hand.
Here are a few sources: https://www.khanacademy.org/math/probabi...ep-by-step
https://www.sciencebuddies.org/science-f...-deviation
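
The step-by-step procedure those pages describe (find the mean, subtract it from each point, square the differences, average them, take the square root) could be sketched with plain loops as below. This is one possible reading of "don't use built-in functions", not the assignment's required solution. The function names and the `sample` flag are made up for this example; the flag switches the divisor from N to N - 1.

```python
def mean_by_hand(values):
    """Add up the values in a loop and divide by how many there are."""
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
    return total / count


def std_by_hand(values, sample=False):
    """Standard deviation from the definition, using explicit loops.

    With sample=False the sum of squared deviations is divided by N;
    with sample=True it is divided by N - 1.
    """
    mean = mean_by_hand(values)
    sum_sq = 0.0
    count = 0
    for v in values:
        deviation = v - mean
        sum_sq += deviation * deviation
        count += 1
    divisor = count - 1 if sample else count
    variance = sum_sq / divisor
    return variance ** 0.5


# Small check with numbers that are easy to verify by hand:
# the mean is 5 and the squared deviations sum to 32, so 32 / 8 = 4.
print(std_by_hand([2, 4, 4, 4, 5, 5, 7, 9]))  # → 2.0
```

To use it on the real data, pass in the list of floats read from DataSet1.dat and compare the result with the scatter estimated from the plot.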