Python Forum

Plotting sum of data files using simple code
Hey,

I'm trying to plot the sum of 25 data files with 5 columns. The code is this:

import matplotlib.pyplot as plt
import numpy as np
from glob import glob

fnames = glob("C:/.../..._*.hst")
data = [np.loadtxt(f) for f in fnames]

x = np.sum(data[:,0])/25
y1 = np.sum(data[:,1])/25
y2 = np.sum(data[:,2])/25

plt.plot(x,y1, label='sync1')
plt.plot(x,y2, label='sync2')
plt.legend()
plt.show()
The error is:

    x = np.sum(data[:,0])/25

TypeError: list indices must be integers or slices, not tuple
Is the data I've created now a set of arrays, or what could be the issue? I'm trying to sum the data in a specific column across all files and plot it against the data in another column across all files.
It's the comma in x = np.sum(data[:,0])/25 that's causing the error (data is a plain Python list, and a list can't be indexed with a tuple such as [:,0]), but there seem to be other things wrong with your code. I'm not sure, so please let me know if this is what you were trying to accomplish.

import matplotlib.pyplot as plt
import numpy as np
from glob import glob
 
fnames = glob("C:/.../..._*.hst")
data = [np.loadtxt(f) for f in fnames]
sum_array = np.sum(data, axis=0)

x = sum_array[0]
y1 = sum_array[1]
y2 = sum_array[2]

plt.plot((x,y1), label='sync1')
plt.plot((x,y2), label='sync2')
plt.legend()
plt.show()
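
For what it's worth, here is a minimal sketch of why the original indexing fails and what a sum over axis=0 returns. The three small arrays are made-up stand-ins for the loaded .hst files (their shape and values are assumptions for illustration), and np.stack assumes every file has the same number of rows and columns.

```python
import numpy as np

# Hypothetical stand-ins for three loaded files, each 4 rows x 5 columns.
data = [np.arange(20, dtype=float).reshape(4, 5) + i for i in range(3)]

# A plain Python list cannot take a tuple index such as [:, 0].
try:
    data[:, 0]
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# Stacking gives one 3-D array of shape (n_files, n_rows, n_cols).
stacked = np.stack(data)

# Summing over axis 0 adds the files together element by element.
sum_array = np.sum(data, axis=0)

first_row = sum_array[0]        # one row, all five columns
first_column = sum_array[:, 0]  # one column, all rows
```

Note that sum_array[0] is the first row of the summed table (five values), while a column needs sum_array[:, 0].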
(Jun-10-2021, 05:58 PM) BashBedlam wrote: It's the comma in x = np.sum(data[:,0])/25 that's causing the error but there seem to be other things wrong with your code. I'm not sure so please let me know if this is what you were trying to accomplish.

Okay so now I do get values out, but I should be getting a spectrum. Let me rephrase what I'm trying to do:

I have files with 5 columns each. I want to sum the numbers in a specific column (columns 1-3). For one file I can use np.sum(data[:,0]) and get a single number as output, but then I want to sum this value over all files (for example, 100 of them) and divide by the number of files to get the average. To illustrate: the sum of all values in column 1 of one file is the number of counts in a specific detector during one experiment, so summing this over all 100 repeated experiments and dividing by 100 gives the average number of counts in that detector. I tried a pandas DataFrame, but the files are .hst and read_csv didn't skip the commented lines at the beginning of each file (with np.loadtxt there is no such problem).

So now the output is like this:

[  0.00000000e+00   9.00000000e+00   9.00000000e+00  -3.75000000e+05
   3.30000000e+01]
[  6.25000000e+02   7.00000000e+00   9.00000000e+00  -3.74375000e+05
   2.80000000e+01]
[  1.25000000e+03   1.30000000e+01   1.70000000e+01  -3.73750000e+05
   2.30000000e+01]
So I'm guessing this code doesn't do what I'm actually trying to achieve; I should have explained it more clearly. Basically, I want three values as output: for each of columns 1, 2 and 3, the per-file column sums added up over all files and divided by the number of files read. Then I can plot them: average count rates for detectors 1 and 2 as a function of average time. Can this be done with np.sum and np.loadtxt, using some sort of loop that first sums each column's values in every file separately and then combines them?
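
A loop of that kind could look roughly like the sketch below. The file contents are invented and read from in-memory strings so the example runs on its own; in the real script the loop would go over the paths returned by glob(). The '#' comment character and the names avg_time, avg_sync1 and avg_sync2 are assumptions, not something taken from the actual .hst files.

```python
import io
import numpy as np

# Hypothetical file contents: '#' header lines followed by 5 numeric columns.
fake_files = [
    "# header\n# more header\n0 1 2 -3 4\n10 3 4 -2 4\n",
    "# header\n# more header\n0 5 6 -3 4\n10 7 8 -2 4\n",
]

totals = np.zeros(5)  # running per-column totals over all files
for text in fake_files:
    table = np.loadtxt(io.StringIO(text))  # '#' lines are skipped by default
    totals += table.sum(axis=0)            # column sums for this one file

averages = totals / len(fake_files)  # average column sum per file
avg_time, avg_sync1, avg_sync2 = averages[:3]
```

Each pass adds one file's column sums to the running totals, so only a single 5-element array is kept in memory regardless of how many files are read.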
So... x is the sum of column one from all of the files and len(fnames) will be the number of files read. That makes x / len(fnames) the average that you're looking for... Right?
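
To make the difference between the two kinds of average concrete, here is a small sketch under the assumption that all files have the same shape. The arrays are made-up stand-ins for the loaded files, and n_files plays the role of len(fnames). Dividing the grand total by the number of files gives one number per column, while averaging the files element by element keeps one value per row, which is what a spectrum plot would need.

```python
import numpy as np

# Hypothetical stand-ins for the loaded files: 3 files, 4 rows, 5 columns.
data = [np.full((4, 5), float(i + 1)) for i in range(3)]
n_files = len(data)  # plays the role of len(fnames)

stacked = np.stack(data)  # shape (n_files, n_rows, n_cols)

# One number per column: all rows of all files added up, divided by n_files.
avg_totals = stacked.sum(axis=(0, 1)) / n_files

# One value per row and column: files averaged element by element.
mean_table = stacked.mean(axis=0)
x, y1, y2 = mean_table[:, 0], mean_table[:, 1], mean_table[:, 2]
```

With x, y1 and y2 taken as columns like this, plt.plot(x, y1, label='sync1') passes x and y as separate arguments and draws one curve per detector.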