Python Forum
Problem of getting XLS file data into python
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Problem of getting XLS file data into python
#1

.xls   May.XLS (Size: 50.75 KB / Downloads: 206)
.xls   Apr.XLS (Size: 54.54 KB / Downloads: 199)
I'm trying to get the data from 2 XLS files into python. These XLS files were exported files from SAP system.

However i can't get the data into python.

https://drive.google.com/open?id=0B6D2lP...UlYdHFMdHM

https://drive.google.com/open?id=0B6D2lP...0VoR054a1k

Method 1

import glob
import pandas as pd

path = "C:/Users/Desktop/MJ"
file_identifier = "*.XLS"

all_data = pd.DataFrame()
for f in glob.glob(path + "/*" + file_identifier):
    df = pd.read_excel(f)
    all_data = all_data.append(df,ignore_index=True)

No data comes to python variable all_data

Method 2
Also i tried through xlrd

import xlrd

workbook = xlrd.open_workbook('C:/Users/Desktop/MJ/Apr.xls')
Error:
Traceback (most recent call last):   File "<ipython-input-14-9f59ece120e8>", line 2, in <module>     workbook = xlrd.open_workbook('C:/Users/Gishan/Desktop/MJ/Apr.xls')   File "C:\Program Files\Anaconda3\lib\site-packages\xlrd\__init__.py", line 395, in open_workbook     with open(filename, "rb") as f: FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/Gishan/Desktop/MJ/Apr.xls'
Can someone help me to solve this.
Reply
#2
Are you sure the path C:/Users/Desktop/MJ is correct?
I think between Users and Desktop should be the username that is logged in the system. i.e. C:/Users/<Some User Name>/Desktop/MJ assuming MJ is folder on the Desktop
Reply
#3
As mention bye @buran the path is wrong.
The file is also not good had to save in Excel(which complain that file is corrupt).
So save it to test.xlsx,now it work in xlrd(as you don't need pandas can read excel).
E.g pandas.read_excel()
>>> import pandas

>>> df = pandas.read_excel('test.xlsx')
>>> df.head()
  Unnamed: 0 Data statistics     Unnamed: 2  Number of   Unnamed: 4  \
0         NaN             NaN            NaN        NaN          NaN   
1         NaN  Records passed            NaN      3.079          NaN   
2         NaN             NaN            NaN        NaN          NaN   
3         NaN             NaN            NaN        NaN          NaN   
4         NaN  JOURNAL_NUMBER  DOCUMENT TYPE        NaN  POSTING KEY   

      Unnamed: 5   Unnamed: 6    Unnamed: 7          Unnamed: 8  \
0             NaN          NaN           NaN                 NaN   
1             NaN          NaN           NaN                 NaN   
2             NaN          NaN           NaN                 NaN   
3             NaN          NaN           NaN                 NaN   
4  GL_ACCT_NUMBER  DESCRIPTION  DEBIT/CREDIT              AMOUNT   

      Unnamed: 9 Unnamed: 10 Unnamed: 11  Unnamed: 12  
0             NaN         NaN         NaN          NaN  
1             NaN         NaN         NaN          NaN  
2             NaN         NaN         NaN          NaN  
3             NaN         NaN         NaN          NaN  
4  EFFECTIVE_DATE  ENTRY_DATE      PERIOD  PREPARER_ID  

>>> df['Data statistics'][:15]
0                NaN
1     Records passed
2                NaN
3                NaN
4     JOURNAL_NUMBER
5                NaN
6         1003001182
7         1003001182
8         1003001183
9         1003001183
10        1003001184
11        1003001184
12        1003001184
13        1003001184
14        1003001185
Name: Data statistics, dtype: object
Reply
#4
Thanks for your help. pd.read_excel() works.

Now i have to find a way to convert/ save as xls files to xlsx. ( abt 100+ files)

Thank you very much.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  how to handling time series data file with Python? aupres 4 483 Aug-10-2020, 12:40 PM
Last Post: MattKahn13
  Python numpy fft from data file magnet1 1 523 Feb-06-2020, 07:30 AM
Last Post: magnet1
  Colormap data from file - pcolormesh problem alrm31015 0 928 May-19-2019, 10:20 PM
Last Post: alrm31015
  Problem with saving data and loading data to mysql kirito85 4 1,190 Feb-08-2019, 10:53 AM
Last Post: kirito85
  Extract data between two dates from a .csv file using Python 2.7 sujai_banerji 1 6,715 Nov-15-2017, 09:48 PM
Last Post: snippsat

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020