##### Mann Whitney U-test on several data sets
rybina (Jan-05-2021, 02:00 PM):

Hi,

I'm really struggling to find a way to do the following. Suppose I have two groups of data sets (fictitious in this example):

- group_a = [1, 5, 7, 3, 5, 8, 34]
- group_b = [1, 2, 4, 3, 5, 8, 45]
- group_c = [1, 5, 7, 3, 5, 8, 35]

- group_1 = [1, 2, 7, 3, 5, 8, 56]
- group_2 = [1, 5, 7, 3, 5, 8, 23]
- group_3 = [1, 4, 6, 3, 5, 8, 25]
- group_4 = [1, 5, 7, 8, 5, 8, 45]
- group_5 = [1, 3, 7, 3, 5, 8, 15]
- group_6 = [1, 5, 7, 3, 5, 8, 16]

I need to perform a Mann Whitney U-test on all possible combinations of the letter and number groups. That is, I want a result for each of the following combinations:

- (group_a, group_1) (group_a, group_2) (group_a, group_3) (group_a, group_4) (group_a, group_5) (group_a, group_6)
- (group_b, group_1) (group_b, group_2) (group_b, group_3) (group_b, group_4) (group_b, group_5) (group_b, group_6)
- (group_c, group_1) (group_c, group_2) (group_c, group_3) (group_c, group_4) (group_c, group_5) (group_c, group_6)

(In reality there are more letter groups and many more number groups.)

Is there an efficient way to do this? Unfortunately I'm quite new to Python and am self-taught. Any advice would be really appreciated.

Additionally, a lot of my work requires comparisons like this, so any suggestions of books, courses, or anything else that would help me with this would also be amazing. I currently work as a Data Analyst and am looking to transition into Statistics, hence I'm trying to do my regular work at a higher level and to use Python as much as I can going forward.

Thanks.

DeaD_EyE (Jan-05-2021, 02:20 PM):

You want better data structures, and product does what its name says: it makes the Cartesian product of iterables: https://docs.python.org/3/library/iterto...ls.product

```python
from itertools import product

groups_first = [
    [1, 5, 7, 3, 5, 8, 34],
    [1, 2, 4, 3, 5, 8, 45],
    [1, 5, 7, 3, 5, 8, 35],
]

groups_second = [
    [1, 2, 7, 3, 5, 8, 56],
    [1, 5, 7, 3, 5, 8, 23],
    [1, 4, 6, 3, 5, 8, 25],
    [1, 5, 7, 8, 5, 8, 45],
    [1, 3, 7, 3, 5, 8, 15],
    [1, 5, 7, 3, 5, 8, 16],
]

print("Without indices")
for first_group, second_group in product(groups_first, groups_second):
    print(first_group, second_group)

print()
print("With indices")

# to get the indices of groups_first and groups_second you can use enumerate
iterator = product(enumerate(groups_first), enumerate(groups_second))

# in addition you can use tuple unpacking
for (first_idx, first_group), (second_idx, second_group) in iterator:
    print(first_idx, first_group, second_idx, second_group)

# this won't work, because first_idx and first_group arrive together as one tuple;
# same for second_idx and second_group
# for first_idx, first_group, second_idx, second_group in iterator:
#     print(first_idx, first_group, second_idx, second_group)
```

Quote:

> Without indices
> [1, 5, 7, 3, 5, 8, 34] [1, 2, 7, 3, 5, 8, 56]
> [1, 5, 7, 3, 5, 8, 34] [1, 5, 7, 3, 5, 8, 23]
> [1, 5, 7, 3, 5, 8, 34] [1, 4, 6, 3, 5, 8, 25]
> [1, 5, 7, 3, 5, 8, 34] [1, 5, 7, 8, 5, 8, 45]
> [1, 5, 7, 3, 5, 8, 34] [1, 3, 7, 3, 5, 8, 15]
> [1, 5, 7, 3, 5, 8, 34] [1, 5, 7, 3, 5, 8, 16]
> [1, 2, 4, 3, 5, 8, 45] [1, 2, 7, 3, 5, 8, 56]
> [1, 2, 4, 3, 5, 8, 45] [1, 5, 7, 3, 5, 8, 23]
> [1, 2, 4, 3, 5, 8, 45] [1, 4, 6, 3, 5, 8, 25]
> [1, 2, 4, 3, 5, 8, 45] [1, 5, 7, 8, 5, 8, 45]
> [1, 2, 4, 3, 5, 8, 45] [1, 3, 7, 3, 5, 8, 15]
> [1, 2, 4, 3, 5, 8, 45] [1, 5, 7, 3, 5, 8, 16]
> [1, 5, 7, 3, 5, 8, 35] [1, 2, 7, 3, 5, 8, 56]
> [1, 5, 7, 3, 5, 8, 35] [1, 5, 7, 3, 5, 8, 23]
> [1, 5, 7, 3, 5, 8, 35] [1, 4, 6, 3, 5, 8, 25]
> [1, 5, 7, 3, 5, 8, 35] [1, 5, 7, 8, 5, 8, 45]
> [1, 5, 7, 3, 5, 8, 35] [1, 3, 7, 3, 5, 8, 15]
> [1, 5, 7, 3, 5, 8, 35] [1, 5, 7, 3, 5, 8, 16]
>
> With indices
> 0 [1, 5, 7, 3, 5, 8, 34] 0 [1, 2, 7, 3, 5, 8, 56]
> 0 [1, 5, 7, 3, 5, 8, 34] 1 [1, 5, 7, 3, 5, 8, 23]
> 0 [1, 5, 7, 3, 5, 8, 34] 2 [1, 4, 6, 3, 5, 8, 25]
> 0 [1, 5, 7, 3, 5, 8, 34] 3 [1, 5, 7, 8, 5, 8, 45]
> 0 [1, 5, 7, 3, 5, 8, 34] 4 [1, 3, 7, 3, 5, 8, 15]
> 0 [1, 5, 7, 3, 5, 8, 34] 5 [1, 5, 7, 3, 5, 8, 16]
> 1 [1, 2, 4, 3, 5, 8, 45] 0 [1, 2, 7, 3, 5, 8, 56]
> 1 [1, 2, 4, 3, 5, 8, 45] 1 [1, 5, 7, 3, 5, 8, 23]
> 1 [1, 2, 4, 3, 5, 8, 45] 2 [1, 4, 6, 3, 5, 8, 25]
> 1 [1, 2, 4, 3, 5, 8, 45] 3 [1, 5, 7, 8, 5, 8, 45]
> 1 [1, 2, 4, 3, 5, 8, 45] 4 [1, 3, 7, 3, 5, 8, 15]
> 1 [1, 2, 4, 3, 5, 8, 45] 5 [1, 5, 7, 3, 5, 8, 16]
> 2 [1, 5, 7, 3, 5, 8, 35] 0 [1, 2, 7, 3, 5, 8, 56]
> 2 [1, 5, 7, 3, 5, 8, 35] 1 [1, 5, 7, 3, 5, 8, 23]
> 2 [1, 5, 7, 3, 5, 8, 35] 2 [1, 4, 6, 3, 5, 8, 25]
> 2 [1, 5, 7, 3, 5, 8, 35] 3 [1, 5, 7, 8, 5, 8, 45]
> 2 [1, 5, 7, 3, 5, 8, 35] 4 [1, 3, 7, 3, 5, 8, 15]
> 2 [1, 5, 7, 3, 5, 8, 35] 5 [1, 5, 7, 3, 5, 8, 16]

rybina (Jan-05-2021, 03:08 PM):

Hi, many thanks for this, it may be what I need to get through this. Thanks again.
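
The thread stops at generating the pairs, so here is a minimal sketch of how the Cartesian product could feed an actual Mann Whitney U-test. The names `mann_whitney_u`, `letter_groups`, `number_groups` and `results` are made up for illustration and do not come from the posts. The p-value uses the tie-corrected normal approximation with a continuity correction, which is an assumption and only rough for samples this small. In practice `scipy.stats.mannwhitneyu(x, y, alternative="two-sided")` would probably be the better choice; the pure-Python version is used here only to keep the sketch free of dependencies.

```python
from collections import Counter
from itertools import product
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value via normal approximation)."""
    n1, n2 = len(x), len(y)
    # U counts the pairs where the x value is larger; a tie counts as one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in product(x, y))
    mean_u = n1 * n2 / 2
    # Tie-corrected variance of U under the null hypothesis.
    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # every value is identical, so nothing can be detected
        return u, 1.0
    # Continuity-corrected z score, then the two-sided tail probability.
    z = (abs(u - mean_u) - 0.5) / sqrt(variance)
    return u, min(1.0, erfc(z / sqrt(2)))


# Dictionaries keep each group's name next to its data (illustrative layout).
letter_groups = {
    "group_a": [1, 5, 7, 3, 5, 8, 34],
    "group_b": [1, 2, 4, 3, 5, 8, 45],
    "group_c": [1, 5, 7, 3, 5, 8, 35],
}
number_groups = {
    "group_1": [1, 2, 7, 3, 5, 8, 56],
    "group_2": [1, 5, 7, 3, 5, 8, 23],
    "group_3": [1, 4, 6, 3, 5, 8, 25],
    "group_4": [1, 5, 7, 8, 5, 8, 45],
    "group_5": [1, 3, 7, 3, 5, 8, 15],
    "group_6": [1, 5, 7, 3, 5, 8, 16],
}

# One test per (letter group, number group) pair, keyed by the two names.
results = {}
for (name_x, x), (name_y, y) in product(letter_groups.items(), number_groups.items()):
    results[(name_x, name_y)] = mann_whitney_u(x, y)

for (name_x, name_y), (u, p) in results.items():
    print(f"{name_x} vs {name_y}: U = {u:.1f}, p = {p:.3f}")
```

Adding more letter or number groups only means adding entries to the dictionaries; the loop stays the same. With this many comparisons, a multiple-testing correction on the p-values is worth considering.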
