Python Forum
Copying Data from Multiple XLS Files to a Cumulative Master File
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Copying Data from Multiple XLS Files to a Cumulative Master File
#1
I'm facing a challenge with the following: I have weekly files of sports results that I want to consolidate into a master file each week. The source is an Excel template (XLS format) that I download for each match. The source format has fixed tabs and each spreadsheet has a fixed format.

My goal is to place the new templates in a fixed folder, and have certain source data copied to my master file. Each time, the data should be copied into an empty cell below the previous match data.

What I want seems fairly basic, but the strange thing is that the Python code I wrote (with the help of Copilot) somehow cannot copy the first cells. For example, when I try to get the data from cell B1, Python strangely gets the info from B3. In total, I want to copy 4 fields, but everything shifts by one cell. I want to extract the following from the source file: Cell B1, B3, B4, and B6 These are vertical in the source file but I want to place them horizontally in my master file so that I can, for example, create an Excel Pivot table. However, it doesn't work to get the data from the first cell, B1, into my master file. Note, the info is vertical in the source file wed_cumula, but I want it horizontally in my master file.

Another challenge is that it doesn't work to read an XLS file; only when I convert the source file to an XLSX file can the Python script read and copy the file, but then it doesn't get the first cell B1. However, opening and saving the files as XLSX instead of XLS each time is not an option because I want to download many files in bulk and convert them to the master file.

I have put the Python script in a WordPad document attached. The source files are named wed_cumula (always XLS files when I download them) and I want to get the cells in the following order in my master data sheet horizontally: B2, B4, B1, B3, B6 And my master file is named BNXT_202425. I have included an example of how it should be and the cells that refer to the correct cell from the source file. In red, how I get it now, without cell B1, but also based on the source file as XLSX instead of XLS.

So the errors I notice are as follows: My script cannot read the XLS file, only XLSX and I miss the first cell, namely B1.

Who can help me get the correct script?

Thanks, Martijn

Attached Files

.xls   wed_cumula (4).xls (Size: 27.5 KB / Downloads: 1)
.xls   wed_cumula (3).xls (Size: 28 KB / Downloads: 0)
.xls   wed_cumula (2).xls (Size: 24 KB / Downloads: 1)
.txt   Python Code.txt (Size: 2.99 KB / Downloads: 1)
.xlsx   BNXT_202425.xlsx (Size: 15.41 KB / Downloads: 0)
Reply
#2
Using xlrd you can read the xls files, then use openpyxl to save as xlsx files.

import xlrd
import openpyxl

import xlrd
from openpyxl import Workbook

path2xl = '/home/pedro/Python_Virtual_Environments/openpyxl/xl_files/'
xls_file = 'wed_cumula_2.xls'
xlsx_file = 'wed_cumula_2.xlsx'

# Reading .xls file using xlrd
wb_xls = xlrd.open_workbook(path2xl + xls_file)
# Get the number of sheets
#num_sheets = wb.nsheets # returns 6 for this file
# Get the sheet names, more convenient
sheet_names = wb_xls.sheet_names()

# create an xlsx file in openpyxl
wb_xlsx = Workbook()

# make sheets in the xlsx file the same as the xls
for sheet in sheet_names:
    wb_xlsx.create_sheet(sheet)

# when wb_xlsx is created it has 1 sheet called Sheet.
# get rid of Sheet
wb_xlsx.remove(wb_xlsx['Sheet'])

# save the new xlsx file
wb_xlsx.save(path2xl + xlsx_file)
Now you have an xlsx file with the same sheet names as your xls file. What do you want to do now? Copy everything? Or what?
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Import multiple CSV files into pandas Krayna 0 2,187 May-20-2021, 04:56 PM
Last Post: Krayna
  Loading multiple JSON files to create a csv 0LI5A3A 0 2,669 Jun-28-2020, 10:35 PM
Last Post: 0LI5A3A
  Binning data to files Kappel 4 3,333 Jun-22-2020, 06:25 PM
Last Post: Kappel
  Filter rows by multiple text conditions in another data frame i.e contains strings an Pan 0 2,592 Jun-09-2020, 06:05 AM
Last Post: Pan
  Least-squares fit multiple data sets multiverse22 1 2,926 Jun-06-2020, 01:38 AM
Last Post: Larz60+
  XLSX file with multiple sheets to josn file ovidius 2 3,090 Apr-05-2020, 09:22 AM
Last Post: ovidius
  how to fit an exponential function to multiple data python_newbie09 1 3,812 Sep-24-2019, 08:12 AM
Last Post: scidam
  Ask for machine learning Python example with 2 data files user5566b 2 2,931 Sep-05-2019, 12:15 PM
Last Post: user5566b
  Need Help With Filtering Data For Excel Files Using Pandas eddywinch82 9 9,030 Aug-06-2019, 03:44 PM
Last Post: eddywinch82
  Append Multiple CSV files Nidhesh 2 3,137 Jul-03-2019, 11:55 AM
Last Post: Nidhesh

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020