Problem of getting XLS file data into python - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Problem of getting XLS file data into python (/thread-1669.html) |
Problem of getting XLS file data into python - klllmmm - Jan-19-2017 [attachment=112][attachment=113] I'm trying to get the data from 2 XLS files into python. These XLS files were exported files from SAP system. However i can't get the data into python. https://drive.google.com/open?id=0B6D2lP-sz4CZX3lPWUlYdHFMdHM https://drive.google.com/open?id=0B6D2lP-sz4CZSEd0c0VoR054a1k Method 1 import glob import pandas as pd path = "C:/Users/Desktop/MJ" file_identifier = "*.XLS" all_data = pd.DataFrame() for f in glob.glob(path + "/*" + file_identifier): df = pd.read_excel(f) all_data = all_data.append(df,ignore_index=True)No data comes to python variable all_data Method 2 Also i tried through xlrd import xlrd workbook = xlrd.open_workbook('C:/Users/Desktop/MJ/Apr.xls') Can someone help me to solve this.
RE: Problem of getting XLS file data into python - buran - Jan-19-2017 Are you sure the path C:/Users/Desktop/MJ is correct?I think between Users and Desktop should be the username that is logged in the system. i.e. C:/Users/<Some User Name>/Desktop/MJ assuming MJ is folder on the Desktop
RE: Problem of getting XLS file data into python - snippsat - Jan-19-2017 As mention bye @buran the path is wrong. The file is also not good had to save in Excel(which complain that file is corrupt). So save it to test.xlsx ,now it work in xlrd(as you don't need pandas can read excel).E.g pandas.read_excel() >>> import pandas >>> df = pandas.read_excel('test.xlsx') >>> df.head() Unnamed: 0 Data statistics Unnamed: 2 Number of Unnamed: 4 \ 0 NaN NaN NaN NaN NaN 1 NaN Records passed NaN 3.079 NaN 2 NaN NaN NaN NaN NaN 3 NaN NaN NaN NaN NaN 4 NaN JOURNAL_NUMBER DOCUMENT TYPE NaN POSTING KEY Unnamed: 5 Unnamed: 6 Unnamed: 7 Unnamed: 8 \ 0 NaN NaN NaN NaN 1 NaN NaN NaN NaN 2 NaN NaN NaN NaN 3 NaN NaN NaN NaN 4 GL_ACCT_NUMBER DESCRIPTION DEBIT/CREDIT AMOUNT Unnamed: 9 Unnamed: 10 Unnamed: 11 Unnamed: 12 0 NaN NaN NaN NaN 1 NaN NaN NaN NaN 2 NaN NaN NaN NaN 3 NaN NaN NaN NaN 4 EFFECTIVE_DATE ENTRY_DATE PERIOD PREPARER_ID >>> df['Data statistics'][:15] 0 NaN 1 Records passed 2 NaN 3 NaN 4 JOURNAL_NUMBER 5 NaN 6 1003001182 7 1003001182 8 1003001183 9 1003001183 10 1003001184 11 1003001184 12 1003001184 13 1003001184 14 1003001185 Name: Data statistics, dtype: object RE: Problem of getting XLS file data into python - klllmmm - Jan-19-2017 Thanks for your help. pd.read_excel() works. Now i have to find a way to convert/ save as xls files to xlsx. ( abt 100+ files) Thank you very much. |