Python Forum
ValueError: Index contains duplicate entries, cannot reshape” error when I try to use
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
ValueError: Index contains duplicate entries, cannot reshape” error when I try to use
#1
I have data which includes id , gender , collected time test name and Test values , Units of measurement

Test Names will include all tests that a patient taken and Value col will have its corresponding test result.

I want to analysis on only certain tests and retrieve corresponding test values from "value" col . The analysis will be on those tests and their values , so I thought it would be good idea to pivot on those test names and test values. However when I add TS col I get an error and adding any other test name in the multiindex code does not throw an error.

Steps:
Steps:

df_s.head(30).dropna()
Here in the below screenshot we can see there multiple test taken for each requisition id:
In the below code Iam only getting tests which I want to do analysis 1#
df_s2 = df_s[df_s['Test'].isin(['TOTAL TRIIODOTHYRONINE (T3)','TOTAL THYROXINE (T4)','FREE THYROID 3','FREE THYROID 4','Human Chorionic Gonadotropin (hCG)','BILRUBIN'])]
2# Resetting the index:
df_s3=df_s2.set_index(['ID', 'Name', 'Age', 'Sex', 'CT', 'RT', 'Test', 'Test_Result', 'Units']).reset_index()
3# applything multiindex
idx = pd.MultiIndex.from_arrays([df_s3['ID'], df_s3['Name'], df_s3['Age'], df_s3['Sex'], df_s3['CT'],df_s3['RT'], df_s3['Units'], df_s3['Test'],  ])
#, df_s3['Unit of Measure']
df_s5 = df_s3.set_index(idx).Test_Result.unstack(fill_value='')
df_s5.columns.name = None
df_s6= df_s5.reset_index()
df_s6.head(100)
I get this result if do not add TSH (from Test Col)
Output:
ID Name Age Sex CT RT Units BILRUBIN FREE THYROID 3 FREE THYROID 4 Human Chorionic Gonadotropin (hCG) TOTAL THYROXINE (T4) TOTAL TRIIODOTHYRONINE (T3) 0 RQ0556048140 Madhuri Dev 28 Years Female 8/01/2019 10:30 8/01/2019 14:39 ng/dl 1.93 1 RQ0556048140 Madhuri Dev 28 Years Female 8/01/2019 10:30 8/01/2019 14:39 μg/dl 18.70 2 RQ06916497688 B/O BARSARANI BISWAL 10 Month(s) 15 Day(s) Female 17/06/2019 12:00 17/06/2019 14:41 ng/dl 179.82 3 RQ06916497688 B/O BARSARANI BISWAL 10 Month(s) 15 Day(s) Female 17/06/2019 12:00 17/06/2019 14:41 μg/dl 18.30 4 RQ09026492462 Sri Hemanta Sarkar 46 Years Male 30/04/2018 08:35 30/04/2018 14:30 ng/dl 1.15 5 RQ09026492462 Sri Hemanta Sarkar 46 Years Male 30/04/2018 08:35 30/04/2018 14:30 μg/dl 9.20 6 RQ1001489038840 RENUKA MAHAPATRA 65 Years Female 28/07/2019 08:20 28/07/2019 13:16 ng/dl 90 7 RQ1001489038840 RENUKA MAHAPATRA 65 Years Female 28/07/2019 08:20 28/07/2019 13:16 μg/dl 7.40 8 RQ1004195473943 Mrs Mamata Samantray 45 Years Female 23/09/2017 11:40 23/09/2017 13:13 ng/dl 1.58 9 RQ1004195473943 Mrs Mamata Samantray 45 Years Female 23/09/2017 11:40 23/09/2017 13:13 μg/dl 15.60 10 RQ1009478939089 Sabita Lenka 30 Years Female 11/06/2017 13:00 12/06/2017 10:10 ng/dl 1.78 11 RQ1009478939089 Sabita Lenka 30 Years Female 11/06/2017 13:00 12/06/2017 10:10 μg/dl 12.50 12 RQ1012532242276 Sanjukta Mishra 47 Years Female 19/03/2018 11:30 19/03/2018 16:35 ng/dl 0.66 13 RQ1012532242276 Sanjukta Mishra 47 Years Female 19/03/2018 11:30 19/03/2018 16:35 μg/dl 6.40 14 RQ1013250484240 Mrs Abha Kansari 45 Years Female 27/07/2017 11:20 27/07/2017 12:42 ng/dl NaN 15 RQ1013250484240 Mrs Abha Kansari 45 Years Female 27/07/2017 11:20 27/07/2017 12:42 μg/dl NaN 16 RQ1013716969697 Madhusmita Sahu 17 Years Female 31/07/2017 11:40 31/07/2017 13:38 ng/dl 0.29 17 RQ1013716969697 Madhusmita Sahu 17 Years Female 31/07/2017 11:40 31/07/2017 13:38 μg/dl 0.70 18 RQ10189073348 Sumati Mahapatra 55 Years Female 11/02/2017 09:30 11/02/2017 14:14 ng/dl 0.90 19 RQ10189073348 Sumati Mahapatra 55 Years Female 11/02/2017 09:30 11/02/2017 14:14 μg/dl 9.10 20 RQ101981055296 NARMADA GUPTA 50 Years Female 23/08/2019 09:45 23/08/2019 17:25 ng/dl 105 21 RQ101981055296 NARMADA GUPTA 50 Years Female 23/08/2019 09:45 23/08/2019 17:25 μg/dl 7.10 22 RQ102281766132 Pyari Xalxo 39 Years Female 4/03/2017 10:10 4/03/2017 13:10 mIU/ml 28640 23 RQ102281766132 Pyari Xalxo 39 Years Female 4/03/2017 10:10 4/03/2017 13:10 ng/dl 1.67 24 RQ102281766132 Pyari Xalxo 39 Years Female 4/03/2017 10:10 4/03/2017 13:10 μg/dl 13.10 25 RQ1023270930913 VICTORIA KISPOTTA 42 Years Female 9/09/2019 11:50 9/09/2019 12:22 ng/dl 82 26 RQ1023270930913 VICTORIA KISPOTTA 42 Years Female 9/09/2019 11:50 9/09/2019 12:22 μg/dl 5.78 27 RQ1026366989473 PRATIMA PATNAIK 38 Years Female 8/07/2019 01:15 8/07/2019 16:37 ng/dl 88.55 28 RQ1026366989473 PRATIMA PATNAIK 38 Years Female 8/07/2019 01:15 8/07/2019 16:37 μg/dl 10.70 29 RQ1028984992315 Agastin Horo 40 Years Female 8/02/2017 10:20 8/02/2017 16:44 ng/dl 0.81 ... ... ... ... ... ... ... ... ... ... ... ... ... ... 70 RQ1076842665319 Puspalata Behera 23 Years Female 28/02/2017 11:30 28/02/2017 13:29 μg/dl 7.90 71 RQ1078176595194 Pramadini Chhatoi 37 Years Female 31/03/2018 10:00 31/03/2018 13:54 ng/dl 0.96 72 RQ1078176595194 Pramadini Chhatoi 37 Years Female 31/03/2018 10:00 31/03/2018 13:54 μg/dl 9.80 73 RQ1082829630987 Reena Acharya 52 Years Female 13/11/2017 09:10 13/11/2017 11:49 ng/dl 1.07 74 RQ1082829630987 Reena Acharya 52 Years Female 13/11/2017 09:10 13/11/2017 11:49 μg/dl 11.30 75 RQ1084664624181 NIBEDITA KAR 36 Years Female 29/05/2019 10:14 29/05/2019 10:14 ng/dl 97.25 76 RQ1084664624181 NIBEDITA KAR 36 Years Female 29/05/2019 10:14 29/05/2019 10:14 μg/dl 8.10 77 RQ108506693161 Vinod Hemrom 40 Years Male 24/10/2018 12:00 24/10/2018 16:24 ng/dl 1.26 78 RQ108506693161 Vinod Hemrom 40 Years Male 24/10/2018 12:00 24/10/2018 16:24 μg/dl 11.80 79 RQ109122773470 Tara Bhadur 23 Years Female 23/06/2018 11:30 23/06/2018 15:11 ng/dl 1.35 80 RQ109122773470 Tara Bhadur 23 Years Female 23/06/2018 11:30 23/06/2018 15:11 μg/dl 7.70 81 RQ109263648697 Jyoti Thakur 35 Years Female 15/09/2018 11:30 15/09/2018 16:22 ng/dl 1.01 82 RQ109263648697 Jyoti Thakur 35 Years Female 15/09/2018 11:30 15/09/2018 16:22 μg/dl 9.50 83 RQ1093448652128 PUSPITA MISHRA 23 Years Female 30/04/2019 09:45 30/04/2019 19:37 ng/dl 83 84 RQ1093448652128 PUSPITA MISHRA 23 Years Female 30/04/2019 09:45 30/04/2019 19:37 μg/dl 6.10 85 RQ109359752914 HIRAMANI KACHAP 30 Years Female 14/08/2019 03:00 14/08/2019 18:26 ng/dl 88 86 RQ109359752914 HIRAMANI KACHAP 30 Years Female 14/08/2019 03:00 14/08/2019 18:26 μg/dl 6.50 87 RQ1097475978863 CHULESWARI PATRA 18 Years Female 1/09/2019 10:30 1/09/2019 11:08 ng/dl 88 88 RQ1097475978863 CHULESWARI PATRA 18 Years Female 1/09/2019 10:30 1/09/2019 11:08 μg/dl 6.90 89 RQ1098576134741 S PATEL 29 Years Female 30/07/2017 10:30 31/07/2017 15:17 ng/dl 1.14 90 RQ1098576134741 S PATEL 29 Years Female 30/07/2017 10:30 31/07/2017 15:17 μg/dl 12.70 91 RQ1098887741955 L MUKHI 32 Years Female 5/08/2019 05:00 5/08/2019 18:02 ng/dl 118 92 RQ1098887741955 L MUKHI 32 Years Female 5/08/2019 05:00 5/08/2019 18:02 μg/dl 8.50 93 RQ1099369598030 Amit Sahoo 38 Years Male 28/03/2019 09:14 28/03/2019 09:51 ng/dl 1.88 94 RQ1099369598030 Amit Sahoo 38 Years Male 28/03/2019 09:14 28/03/2019 09:51 μg/dl 10.70 95 RQ1101382949711 MEHROOM NISHA 50 Years Female 29/04/2019 08:55 29/04/2019 10:43 ng/dl 172.77 96 RQ1101382949711 MEHROOM NISHA 50 Years Female 29/04/2019 08:55 29/04/2019 10:43 μg/dl 10.10 97 RQ1103159227767 Mamata Das Mahapatra 30 Years Female 5/01/2019 11:20 5/01/2019 11:20 ng/dl 1.21 98 RQ1103159227767 Mamata Das Mahapatra 30 Years Female 5/01/2019 11:20 5/01/2019 11:20 μg/dl 9.10 99 RQ1114005283147 Manji Kaur 47 Years Female 19/06/2017 09:00 19/06/2017 10:50 ng/dl
Code with TSH test: Retry 1# with TSH

df_s2 = df_s[df_s['Test'].isin(['TOTAL TRIIODOTHYRONINE (T3)','TOTAL THYROXINE (T4)','THYROID STIMULATING HORMONE (TSH)','FREE THYROID 3','FREE THYROID 4','Human Chorionic Gonadotropin (hCG)','BILRUBIN'])]

df_s3=df_s2.set_index(['ID', 'Name', 'Age', 'Sex', 'CT', 'RT', 'Test', 'Test_Result', 'Units']).reset_index()

idx = pd.MultiIndex.from_arrays([df_s3['ID'], df_s3['Name'], df_s3['Age'], df_s3['Sex'], df_s3['CT'],df_s3['RT'], df_s3['Units'], df_s3['Test'],  ])
#, df_s3['Unit of Measure']
df_s5 = df_s3.set_index(idx).Test_Result.unstack(fill_value='')
df_s5.columns.name = None
df_s6= df_s5.reset_index()
df_s6.head(100)
Output:
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-63-82c092f6cd99> in <module> 1 idx = pd.MultiIndex.from_arrays([df_s3['ID'], df_s3['Name'], df_s3['Age'], df_s3['Sex'], df_s3['CT'],df_s3['RT'], df_s3['Units'], df_s3['Test'], ]) ----> 2 df_s5 = df_s3.set_index(idx).Test_Result.unstack(fill_value='') 3 df_s5.columns.name = None 4 df_s6= df_s5.reset_index() 5 df_s6.head(100) C:\BhargaviM\MyAnaconda\lib\site-packages\pandas\core\series.py in unstack(self, level, fill_value) 2897 """ 2898 from pandas.core.reshape.reshape import unstack -> 2899 return unstack(self, level, fill_value) 2900 2901 # ---------------------------------------------------------------------- C:\BhargaviM\MyAnaconda\lib\site-packages\pandas\core\reshape\reshape.py in unstack(obj, level, fill_value) 499 unstacker = _Unstacker(obj.values, obj.index, level=level, 500 fill_value=fill_value, --> 501 constructor=obj._constructor_expanddim) 502 return unstacker.get_result() 503 C:\BhargaviM\MyAnaconda\lib\site-packages\pandas\core\reshape\reshape.py in __init__(self, values, index, level, value_columns, fill_value, constructor) 135 136 self._make_sorted_values_labels() --> 137 self._make_selectors() 138 139 def _make_sorted_values_labels(self): C:\BhargaviM\MyAnaconda\lib\site-packages\pandas\core\reshape\reshape.py in _make_selectors(self) 173 174 if mask.sum() < len(self.index): --> 175 raise ValueError('Index contains duplicate entries, ' 176 'cannot reshape') 177 ValueError: Index contains duplicate entries, cannot reshape
Question1 (Retry 1# with TSH ): Please help me with the correct approach, what I understand the error is because once it convert it is not finding any unique index but not sure how to resolve it.

Question2: When I proceeded to go ahead without tsh, after conversion of test col- rows to cols, I get blank values in respective test col ( example T4 col) because 1) the person has taken the test but there is no value in the dataset(python is treating it as Null value and can be imputed/rejected - no issue 2) the patient has not taken this test but has taken atleast one other tests may be T3, hcg etc but not this test- this is considered as string '' . I want to get rid of these rows for amy analysis .. is there an approach while transforming the data to take care of so that I only want the result of the code to have T4 and its value( numeric or null). I do not want a scenario where the person has not taken test at all. OR is there a way to impute these values so I will know the person has taken T4, T3 but not Hcg , bilrubin etc?

Please advise. Long questions but I hope it this explanatory
Output:
Reply


Messages In This Thread
ValueError: Index contains duplicate entries, cannot reshape” error when I try to use - by Smiling29 - Oct-13-2019, 01:51 PM

Possibly Related Threads…
Thread Author Replies Views Last Post
  pyscript index error while calling input from html form pyscript_dude 2 938 May-21-2023, 08:17 AM
Last Post: snippsat
  Index error help MRsquared 1 739 May-15-2023, 03:28 PM
Last Post: buran
  I'm getting a String index out of range error debian77 7 2,280 Jun-26-2022, 09:50 AM
Last Post: deanhystad
  Numpy reshape mr_byte31 1 1,078 Apr-22-2022, 06:33 PM
Last Post: deanhystad
  Reshape txt file into particular format using python shantanu97 0 1,397 Dec-10-2021, 11:44 AM
Last Post: shantanu97
  Python Error List Index Out of Range abhi1vaishnav 3 2,239 Sep-03-2021, 08:40 PM
Last Post: abhi1vaishnav
  Strange error ValueError: dimension mismatch Anldra12 0 1,937 Aug-17-2021, 07:54 AM
Last Post: Anldra12
  Index error - columns vs non-column Vinny 3 4,849 Aug-09-2021, 04:46 PM
Last Post: snippsat
  How to resolve Index Error in my code? codify110 6 2,957 May-22-2021, 11:04 AM
Last Post: supuflounder
  Why getting ValueError : Math domain error in trig. function, math.asin() ? jahuja73 3 3,707 Feb-24-2021, 05:09 PM
Last Post: bowlofred

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020