Python Forum

Full Version: error merge csv file
Hello, I tried to merge two CSV files. The column they have in common, which I want to merge on, is 'sel'. But I got an error message. Here is my code and the error message I got.

import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

dataWNT = pd.read_csv('WNT.csv')
WNT = pd.DataFrame(dataWNT)

dataMutasi = pd.read_csv('mutasi_sel_Oke.csv', error_bad_lines=False)
Mutasi = pd.DataFrame(dataMutasi)

joinWNT_Mutasi = WNT.merge(Mutasi,on='sel')

print(joinWNT_Mutasi)
And here is the error message:

Error:
KeyError                                  Traceback (most recent call last)
<ipython-input-1-8f18d8599b3d> in <module>
     10 Mutasi = pd.DataFrame(dataMutasi)
     11
---> 12 joinWNT_Mutasi = WNT.merge(Mutasi,on='sel')
     13
     14 print(joinWNT_Mutasi)

~\anaconda3\lib\site-packages\pandas\core\frame.py in merge(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
   7944         from pandas.core.reshape.merge import merge
   7945
-> 7946         return merge(
   7947             self,
   7948             right,

~\anaconda3\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     72     validate=None,
     73 ) -> "DataFrame":
---> 74     op = _MergeOperation(
     75         left,
     76         right,

~\anaconda3\lib\site-packages\pandas\core\reshape\merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    650             self.right_join_keys,
    651             self.join_names,
--> 652         ) = self._get_merge_keys()
    653
    654         # validate the merge keys dtypes. We may need to coerce

~\anaconda3\lib\site-packages\pandas\core\reshape\merge.py in _get_merge_keys(self)
   1016                         right_keys.append(rk)
   1017                     if lk is not None:
-> 1018                         left_keys.append(left._get_label_or_level_values(lk))
   1019                         join_names.append(lk)
   1020                     else:

~\anaconda3\lib\site-packages\pandas\core\generic.py in _get_label_or_level_values(self, key, axis)
   1561             values = self.axes[axis].get_level_values(key)._values
   1562         else:
-> 1563             raise KeyError(key)
   1564
   1565         # Check for duplicates

KeyError: 'sel'
Could anyone please help me?
The error says one of the dataframes does not contain the key (column) "sel". Print out WNT and Mutasi to verify that they both contain the key.
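The check suggested above could look something like the sketch below. The two small frames are hypothetical stand-ins for WNT.csv and mutasi_sel_Oke.csv, and the stray space in the second header is only an assumed cause (one common way a column that looks like "sel" fails to match), not something confirmed by the original post.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two CSV files. The second header has a
# leading space (" sel"); this is an assumed cause, used for illustration.
wnt = pd.read_csv(io.StringIO("sel,RESPON\nRT4,0\nT-24,1\n"))
mutasi = pd.read_csv(io.StringIO(" sel,CAMK1G\nRT4,1\nT-24,0\n"))

# Show the exact column labels; tolist() makes stray whitespace visible
# because each name is printed as a quoted string.
print(wnt.columns.tolist())
print(mutasi.columns.tolist())

# Record which frames lack the key before any clean-up.
frames = {"wnt": wnt, "mutasi": mutasi}
missing_before = [name for name, df in frames.items() if "sel" not in df.columns]
print("frames without 'sel':", missing_before)

# Strip surrounding whitespace from the header names, then merge.
mutasi.columns = mutasi.columns.str.strip()
merged = wnt.merge(mutasi, on="sel")
print(merged)
```

If the key really has a different name in one file rather than stray whitespace, passing left_on and right_on to merge instead of on is the usual alternative.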
Thank you for the answer. I have corrected it, and the merge now runs. But I get a new problem: the result is empty.

Error:
Empty DataFrame
Columns: [sel, NILAI_RESPON, RESPON, NADK_ENST00000342348, CAMK1G, NTF3, ANKRD30A_ENST00000602533, SFRP5, CNTN5_ENST00000524871, MICAL2_ENST00000379612, FCRL2, OTOGL_ENST00000458043, LRRIQ1_ENST00000393217, LRRIQ1, HEATR1, ANKLE2, SLC11A2, KIAA1199, HAPLN3, UCP2, FRMD4A, TECPR2, TGM1, OR5A2, MS4A8B, C11orf66, PLEKHS1, TXNRD1_ENST00000378070, POSTN, ATP10A, RP11-578F21.5, PLEKHM2_ENST00000375799, PLEKHM2, PDE3B, NUCB2, OTOG, CLSTN3, HYOU1, CUBN, MAN2A2, PGPEP1L, TXNDC11, IPMK, ANK3_ENST00000395293, SLC29A3, CDC42BPG_ENST00000342711, CDC42EP2, HIVEP3, RASAL1, NOS1, CAMKK2, CAMKK2_ENST00000392473, TMEM120B_ENST00000449592, MAP1A, FAM179B, PPFIBP1_ENST00000318304, PPFIBP1, FLG, C15orf33, TCF12_ENST00000438423, TCF12_ENST00000343827, GPR19, TMEM132C_ENST00000435159, HNRNPCL1, PRDM2, ZNF692, SLC35F4, SLC35F4_ENST00000339762, MCF2L_ENST00000375608, LTK, ADAMTS18, MUC16, MUC16_ENST00000397910, CSRP2BP, UBE2M, TCEAL5, WDR81, WDR81_ENST00000437219, TP53_ENST00000269305, DNAH2_ENST00000570791, ZNF536, RHPN2, ALDH16A1, FCGRT, LILRB1, SIGLEC1_ENST00000202578, NFATC2, NDUFA5, CNTNAP3B, Q8N1G8_HUMAN, ADAM7, SLC7A13_ENST00000419776, SMG6, PHF12, RNF213_ENST00000319921, RNF213_ENST00000411702, PSMG1, ZNF717, UGT2B10_ENST00000265403, UGT2B10, ...]
Index: []

[0 rows x 20523 columns]
I printed both DataFrames, and neither of them is empty.

Output:
          sel  NILAI_RESPON  RESPON
0   NCI-H1703           0.0       0
1         RT4           0.0       0
2  NCI-SNU-16   665.128.619       0
Output:
sel NADK_ENST00000342348 CAMK1G NTF3 ANKRD30A_ENST00000602533 SFRP5 CNTN5_ENST00000524871 MICAL2_ENST00000379612 FCRL2 OTOGL_ENST00000458043 ... ZNF880 C12orf74 GDF5OS_ENST00000374375 PSME1_ENST00000382708 ICA1 NUTM2B PRSS36 GDF5OS SLC24A4_ENST00000532405 SPANXN1
0 HCC1806 1 1 1 1 1 1 1 1 1 ... 0 0 0 0 0 0 0 0 0 0
509 NCI-H358 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1785 NCI-H3122 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2161 NCI-H747 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2435 T-24 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
20316 T47D 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
20337 NCI-H1437 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
20373 NCI-H23 0 0 1 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
20462 CAL-148 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
20561 NCI-H1299 0 0 0 1 0 0 0 0 0 ... 1 1 1 1 1 1 1 1 1 1

85 rows × 20521 columns
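With the default how='inner', merge keeps only rows whose 'sel' value appears in both frames, so an empty result with all columns present means no key matched exactly. A minimal sketch for diagnosing that, using small hypothetical frames rather than the data above: the trailing space and the lower-case name are assumed mismatches for illustration, and the real files may simply share no cell lines.

```python
import pandas as pd

# Hypothetical frames, not the poster's data. "RT4 " has a trailing space
# and "nci-h23" differs in case, so no key matches exactly.
left = pd.DataFrame({"sel": ["RT4 ", "NCI-H23", "T47D"], "RESPON": [0, 1, 0]})
right = pd.DataFrame({"sel": ["RT4", "nci-h23", "HCC1806"], "CAMK1G": [1, 0, 1]})

def shared_keys(a, b, key):
    """Return the set of key values present in both frames."""
    return set(a[key]) & set(b[key])

before = shared_keys(left, right, "sel")
print("shared keys before clean-up:", before)

def normalise(series):
    """Cast to string, strip surrounding whitespace, upper-case."""
    return series.astype(str).str.strip().str.upper()

left["sel"] = normalise(left["sel"])
right["sel"] = normalise(right["sel"])
after = shared_keys(left, right, "sel")
print("shared keys after clean-up:", after)

# An outer merge with indicator=True adds a "_merge" column that labels each
# row as "both", "left_only" or "right_only", showing which keys failed to match.
report = left.merge(right, on="sel", how="outer", indicator=True)
print(report["_merge"].value_counts())
```

If the set of shared keys stays empty even after normalising, the two files probably describe different cell lines, and no merge option will produce matching rows.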