Python Forum
Errors if an xlsx file has blank rows in the beginning….
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Errors if an xlsx file has blank rows in the beginning….
#1
Greetings!
This is my first try using Pandas.
The sniped below works file if the files(.XLSX) do not have blank lines in the beginning.
I thought by using 'dropna' I'll fix the problem but it does not:
exlf = exlf.dropna(axis = 0, how = 'all')
Snipped:
import pandas as pd

exlf = pd.read_excel('C:/01/XLS_Files/SHORT_2.xlsx')
exlf = exlf.dropna(axis = 0, how = 'all')
#print(f" Blank Row Removed.. {exlf}")
four_col = exlf.iloc[: , [0,4,9,17]].copy()    # <--- Filtering 4 Columns
#print('\n. Only 4(Four) Columns \n :',four_col) 

filtered_l = four_col[(four_col['VR Site'] =='ORVCC') & (four_col['VR Duration(Days)'] >=449)]
print(filtered_l)
Any help is appreciated.
Reply
#2
What do you mean by "fix", and how is your code not fixing the problem?

One problem I can see is that you go from using column locations to using column names. If you know column names, why use iloc? If you don't know column names, why not?
four_col = exlf.iloc[: , [0,4,9,17]].copy()    # using location
#print('\n. Only 4(Four) Columns \n :',four_col) 
 
filtered_l = four_col[(four_col['VR Site'] =='ORVCC') & (four_col['VR Duration(Days)'] >=449)]  # Using column names
Is not knowing column names the problem, or is it that the dataframe does not have column names? (same problem as here?)

https://python-forum.io/thread-40610.html

Why are you using copy()? This creates a new DataFrame. No need to make a copy.
exlf.iloc[: , [0,4,9,17]]
As you can see here:
import pandas as pd
from random import choices

df = pd.DataFrame({c: choices(range(1, 101), k=5) for c in "ABCDEFGH"})
df2 = df.iloc[:, [1, 3, 5, 7]]
print(df2)
Output:
B D F H 0 10 32 21 49 1 60 82 8 25 2 32 62 64 96 3 98 93 100 66 4 13 73 21 74
And this shows selecting columns using column names.
import pandas as pd
from random import choices

df = pd.DataFrame({c: choices(range(1, 101), k=5) for c in "ABCDEFGH"})
df2 = df[["B", "D", "F", "H"]]
print(df2)
Output:
B D F H 0 71 27 84 100 1 45 39 96 43 2 53 20 75 86 3 95 82 16 83 4 15 8 56 29
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Export dataframe to xlsx - Error "zipfile.BadZipFile: File is not a zip file" Baggio 10 62,421 Mar-12-2021, 01:02 PM
Last Post: buran
  Python - Pandas writing blank files to file tdunphy 0 2,011 Jan-14-2021, 12:11 AM
Last Post: tdunphy
  XLSX file with multiple sheets to josn file ovidius 2 2,257 Apr-05-2020, 09:22 AM
Last Post: ovidius
  How to copy a .csv worksheet into a .xlsx file without the number values turning into YoshikageKira 7 3,591 Mar-28-2020, 10:38 AM
Last Post: buran
  Write specific rows from pandas dataframe to csv file pradeepkumarbe 3 5,534 Oct-18-2018, 09:33 PM
Last Post: volcano63
  How to filter specific rows from large data file Ariane 7 8,237 Jun-29-2018, 02:43 PM
Last Post: gontajones
  Write data into existing Excel (xlsx) file with multiple sheets BNB 1 15,361 Jun-01-2017, 04:22 PM
Last Post: Larz60+

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020