Python Forum
How to plot histogram from 2 arrays?
#1
I have two arrays where the first array shows the unique values extracted from a column and the second array stores the frequency of these unique values. How can I correctly plot the histogram?

(array([0, 1, 2, 3, 4], dtype=uint8), array([ 1, 20, 20, 30, 45], dtype=int64))

0, 1, 2, 3, 4 are the unique values, and the numbers in the second array show the frequency of each value.

I tried the code below, but it does not produce the correct histogram:

import matplotlib.pyplot as plt

b = ([0,1,2,3,4],[1,20,20,39,45])
x = b[0]
print(x)
y = b[1]
print(y)
plt.hist([x,y], bins='auto')
plt.show()
Thank you
#2
In this case you don't need the hist function, because the values are already counted; try bar instead:

import matplotlib.pyplot as plt

b = ([0,1,2,3,4],[1,20,20,39,45])
plt.bar(*b)  # x positions from b[0], bar heights from b[1]
plt.show()
You may also want to normalize the bar heights:

b = ([0,1,2,3,4],[1,20,20,39,45])
b_normalized = (b[0], [k / sum(b[1]) for k in b[1]])
plt.bar(*b_normalized)
plt.show()
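
As background on why plt.hist did not give the expected picture: hist expects raw observations and does the counting itself, while the two arrays here are already a summary. Below is a minimal sketch of two ways to bridge that gap, assuming the values are small non-negative integers as in the question. The helper name plot_weighted_hist and the choice of bin edges are my own, not something from the posts above.

```python
import numpy as np

values = np.array([0, 1, 2, 3, 4])
counts = np.array([1, 20, 20, 39, 45])

# Option 1: rebuild the raw samples that hist() expects
# (each value repeated as many times as it was counted).
samples = np.repeat(values, counts)

# Option 2: keep the summary and pass the counts as weights.
# One bin centred on each integer value (assumed layout).
edges = np.arange(values.min(), values.max() + 2) - 0.5
hist, _ = np.histogram(values, bins=edges, weights=counts)


def plot_weighted_hist(values, counts, edges):
    """Hypothetical helper: draw the pre-counted data with plt.hist."""
    import matplotlib.pyplot as plt  # imported lazily so the maths above runs without it

    plt.hist(values, bins=edges, weights=counts)
    plt.show()
```

Calling plot_weighted_hist(values, counts, edges) should give the same picture as plt.bar(*b) with bar width 1. For data that is already counted, bar is still the simpler choice.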
#3
Thanks, but I think I need to explain my problem further. I have about 100 files, and I am trying to extract one column from each file and plot the frequencies of the numbers appearing in that column. I tried using lists, but that takes too long, and when I use arrays I cannot get the intended output. The size of the array may change, because some files contain only the numbers 1 to 4 while others contain 1 to 6, and so on. I am not sure how to append to the array as the loop moves on to the next file. Below is my complete code:

for file in hdf_file[1:100]:
    inputf = hdf_folder + file
    if re.search(text, file) and re.search(text1, file):
        with h5py.File(inputf, 'r') as g:
            a = np.array(np.unique(g['No_of_Books'], return_counts=True))

plt.bar(a[0], height=a[1])
plt.show()

(Mar-26-2019, 02:02 PM) python_newbie09 added:
The output from this code, printed once per file, is shown below. The idea is to sum the counts (second row) for each of the numbers shown in the first row, across all files.



[[ 1 2 3 4 5 6]
[2348 51 10 3910 10 10]]
[[ 0 1 2 3 4 6]
[ 7 2022 50 10 11160 20]]
[[ 0 1 4 5 6]
[ 5 546 10829 10 10]]
[[ 0 4]
[ 2 7738]]
[[ 0 2 3 4]
[ 8 40 20 170324]]
[[ 0 1 2 3 4 6]
[ 3 3210 50 10 166969 10]]
[[ 0 2 3 4]
[ 6 40 10 8644]]
[[ 0 1 2 3 4]
[ 9 2035 50 10 1514]]
#4
res = []
for file in hdf_file[1:100]:
    inputf = hdf_folder + file
    if re.search(text, file) and re.search(text1, file):
        with h5py.File(inputf, 'r') as g:
            res.append(np.array(np.unique(g['No_of_Books'], return_counts=True)))
  

size = map(lambda x: len(x[0]), res)
acc = np.zeros(size)
for ix, vals in res:
    acc[ix] += vals
freq = acc / acc.sum()
scale = 100
plt.bar(np.arange(size), height=freq * scale)
plt.show()
This code was not tested.
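
If sizing the accumulator up front is a nuisance, one alternative is to merge the per-file (values, counts) pairs into a collections.Counter, which grows as new values appear. This is only a sketch: merge_counts is a hypothetical name, and the two sample pairs are copied from the output posted in #3.

```python
from collections import Counter


def merge_counts(pairs):
    """Sum counts per value over an iterable of (values, counts) pairs."""
    total = Counter()
    for values, counts in pairs:
        for v, c in zip(values, counts):
            total[int(v)] += int(c)
    return total


# Two of the per-file results from post #3, used as sample input.
pairs = [
    ([0, 4], [2, 7738]),
    ([0, 2, 3, 4], [8, 40, 20, 170324]),
]
total = merge_counts(pairs)

# Sorted x positions and heights, ready to pass to plt.bar(xs, heights).
xs, heights = zip(*sorted(total.items()))
```

The inner loop only runs over the handful of unique values per file, so it should stay cheap even with 100 files.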
#5
(Mar-27-2019, 02:41 AM)scidam Wrote:
res = []
for file in hdf_file[1:100]:
    inputf = hdf_folder + file
    if re.search(text, file) and re.search(text1, file):
        with h5py.File(inputf, 'r') as g:
            res.append(np.array(np.unique(g['No_of_Books'], return_counts=True)))
  

size = map(lambda x: len(x[0]), res)
acc = np.zeros(size)
for ix, vals in res:
    acc[ix] += vals
freq = acc / acc.sum()
scale = 100
plt.bar(np.arange(size), height=freq * scale)
plt.show()
This code was not tested.

Thanks. I tried it, but it produces this error: TypeError: expected sequence object with len >= 0 or a single integer
Any idea why?
#6
Because I don't have the data, I can't reproduce the problem or run any tests.

Suppose that res is predefined:

import numpy as np
import matplotlib.pyplot as plt

# Each entry is [values, counts] for one file.
res = [[[1, 2, 3, 4, 5, 6],
        [2348, 51, 10, 3910, 10, 10]],
       [[1, 2, 3, 4, 5, 6],
        [7, 2022, 50, 10, 11160, 20]],
       [[1, 2, 3, 4, 5],
        [5, 546, 10829, 10, 10]],
       [[0, 4],
        [2, 7738]],
       [[1, 2, 3, 4],
        [8, 40, 20, 170324]],
       [[1, 2, 3, 4, 5, 6],
        [3, 3210, 50, 10, 166969, 10]],
       [[1, 2, 3, 4],
        [6, 40, 10, 8644]],
       [[1, 2, 3, 4, 5],
        [9, 2035, 50, 10, 1514]]]
size = max(map(lambda x: len(x[0]), res))  # length of the longest values list
acc = np.zeros(size)
for ix, vals in res:
    acc[np.array(ix)-1] += vals  # shift 1-based values to 0-based indices
freq = acc / acc.sum()
scale = 100  # show frequencies as percentages
plt.bar(np.arange(size), height=freq * scale)
plt.show()
The code I just posted works fine on my computer. Note that because the values returned by numpy.unique(..., return_counts=True) for your data start from 0, you will need to change acc[np.array(ix)-1] += vals to acc[ix] += vals.
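
For the 0-based case, here is a hedged sketch using three of the arrays from the output in post #3. One assumption differs from the code above: the accumulator is sized from the largest value plus one rather than from the longest array, because an array such as [0, 1, 2, 3, 4, 6] has six entries but needs index 6.

```python
import numpy as np

# Three (values, counts) pairs copied from the output in post #3.
res = [
    (np.array([1, 2, 3, 4, 5, 6]), np.array([2348, 51, 10, 3910, 10, 10])),
    (np.array([0, 1, 2, 3, 4, 6]), np.array([7, 2022, 50, 10, 11160, 20])),
    (np.array([0, 1, 4, 5, 6]), np.array([5, 546, 10829, 10, 10])),
]

# Size by the largest value seen, not by the longest array (assumption, see above).
size = max(int(ix.max()) for ix, _ in res) + 1
acc = np.zeros(size, dtype=np.int64)

for ix, vals in res:
    # Values are unique within each pair, so fancy-index += is safe here.
    acc[ix] += vals

freq = acc / acc.sum()  # relative frequency of each value 0..size-1
```

After this, plt.bar(np.arange(size), height=freq * 100) plots the percentages as in the code above.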
There was an error in the line size = map(...); it should be size = max(map(...)).
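
To illustrate where the TypeError comes from: in Python 3, map returns a lazy iterator, and np.zeros needs an integer or a sequence of integers as its shape. A small sketch with made-up variable names (lazy_size, raised) and two sample pairs taken from post #3:

```python
import numpy as np

res = [([0, 4], [2, 7738]), ([0, 2, 3, 4], [8, 40, 20, 170324])]

# map() gives an iterator, which np.zeros cannot use as a shape.
lazy_size = map(lambda x: len(x[0]), res)
try:
    np.zeros(lazy_size)
    raised = False
except TypeError:
    raised = True

# Wrapping it in max() reduces the iterator to a single integer.
size = max(map(lambda x: len(x[0]), res))
acc = np.zeros(size)
```

The exact wording of the error message may differ between NumPy versions.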