Estimating standard deviation from DataSet
Python Forum, Data Science (https://python-forum.io/Forum-Data-Science)

Estimating standard deviation from DataSet - jomardee - Jan-22-2018

So I am having a hard time calculating the standard deviation given the graph. I was wondering what the steps are. Here is the code I have so far, but it is not giving the right output. The DataSet contains thousands of random numbers such as:

    9.8254457980e-1
    1.0293906530e+0
    8.6314178340e-1
    8.7754757930e-1
    8.2216021950e-1
    9.8155318390e-1
    1.0215753050e+0
    1.0064994180e+0
    1.0300426240e+0
    8.7195144970e-1
    9.4140464140e-1
    1.0811751280e+0
    8.5982980390e-1

    #Worksheet 1.3A - make plot of DataSet vs row Nums
    from numpy import *
    from matplotlib.pyplot import *
    import matplotlib.pyplot as plt
    --------------------------------------------------------------
    # Import data as a list of numbers
    with open("DataSet1.dat", "r") as textFile:
        data = textFile.read().split() # split based on spaces
        data = [float(point) for point in data] # convert strings to floats

    rms = 0
    #This is my code for calculating the square of the deviation
    sqrt_rms = square(rms)
    variance = average([square(i) for i in textFile]) - average1**2
    print("The square of the standard deviation is:", variance)
    print("The standard deviation is:", sqrt(variance))
    #^Would square root of variance calculate the standard deviation?

    plt.xlabel('x axis')
    plt.ylabel('y axis')
    plt.title('Plot of DataSet vs Row Numbers')
    plt.plot(data)
    plt.show()

RE: Estimating standard deviation from DataSet - Larz60+ - Jan-23-2018

Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post.
Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.

Also, do you have more information on the data set? Is there an assignment sheet or a data specification? The data doesn't really look random from the small sample that you are showing. Most of the data points are between 8e-1 and 9e-1, with some 'noise' as low readings. You use the term row in your code, so I am guessing that the data set is an array, correct?

I'm thinking you may need something like numpy.std:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.std.html

RE: Estimating standard deviation from DataSet - jomardee - Jan-24-2018

Here is the code; apologies for the delay. Yes, it is also a bunch of noise. The problem asks:

Make a plot of your data in DataSet1 vs. row number. From your plot, estimate the standard deviation of your data, i.e. how far the points scatter from the average, and record your value. Now write a program to calculate the standard deviation (don’t use built-in functions this time). Discuss how well your estimate matches your calculated value.

Every time we make a measurement, a value of the noise gets determined at random. This is a bit like quantum mechanics, where when we make a measurement of, say, position, the value is determined randomly.
In quantum mechanics the probability of measuring different positions is drawn from a probability distribution given by the complex square of the wave function, P(x) = |Ψ|^2, so that the probability of measuring a position between x1 and x2 is given by the area under the P(x) curve between x1 and x2.

```
#Worksheet 1.3A - make plot of DataSet vs row Nums
import matplotlib.pyplot as plt

# Placeholders: the variance and standard deviation are not calculated yet
variance = 0.0
standDev = 0.0

# Read every whitespace-separated number in the file as a float
with open('DataSet1.dat', 'r') as textFile:
    dataSet = [float(value) for value in textFile.read().split()]

# Average worked out by hand: sum of the values divided by their count
average = sum(dataSet) / len(dataSet)
print("The average is:", average)

# Plot each value against its row number (its index in the list)
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.title('Plot of DataSet vs Row Numbers')
plt.plot(dataSet)
plt.show()
```

RE: Estimating standard deviation from DataSet - Larz60+ - Jan-24-2018

Then, since you can't use numpy, you need the algorithm for the standard deviation math. Here are a few sources:
https://www.khanacademy.org/math/probability/data-distributions-a1/summarizing-spread-distributions/a/calculating-standard-deviation-step-by-step
https://www.sciencebuddies.org/science-fair-projects/science-fair/variance-and-standard-deviation
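
The step-by-step calculation can be sketched in plain Python as follows: take the average, sum the squared deviations from it, divide to get the variance, then take the square root. And yes, the square root of the variance is the standard deviation. This is only a minimal sketch, not the assignment's required solution. The function names `mean` and `standard_deviation` and the `sample` flag are made up for illustration, the example numbers are not from DataSet1.dat, and whether the worksheet wants division by N or by N - 1 is an assumption you would need to check.

```python
def mean(values):
    """Arithmetic mean, accumulated with an explicit loop."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def standard_deviation(values, sample=False):
    """Standard deviation computed from its definition.

    sample=False divides by N (population form, assumed default here);
    sample=True divides by N - 1 (sample form).
    """
    average = mean(values)
    sum_sq = 0.0
    for value in values:
        # Squared distance of each point from the average
        sum_sq += (value - average) ** 2
    divisor = len(values) - 1 if sample else len(values)
    variance = sum_sq / divisor
    # Square root of the variance, without calling a library function
    return variance ** 0.5


# Small made-up data set so the arithmetic can be checked by hand
example = [2, 4, 4, 4, 5, 5, 7, 9]
print("The average is:", mean(example))                           # 5.0
print("The standard deviation is:", standard_deviation(example))  # 2.0
```

To use it on the real data, pass the list of floats read from DataSet1.dat in place of `example`, then compare the result with the scatter you estimated from the plot.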