Python Forum
Merging spreadsheets with the same columns and extracting rows with matching entries
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Merging spreadsheets with the same columns and extracting rows with matching entries
#1
Hi All,

I have just begun learning Python for the last couple of months and I have got to grips with the very basics I think having no background in coding.

I want to build a script that:
1) Identifies the latest created folder in one of my document folders
2) Takes all the spreadsheets in the latest created folder and merges them into one single spreadsheet (they all have the same column headings) (previously I was using the excel "get data" function myself).
3) Extracts relevant rows from the merged spreadsheet by filtering for a list of URLs that I have compiled and outputting all these. i.e. output any rows that contain specific URLs

I do all this manually at the moment and I'm thinking the 30 mins I save doing this everyday could be put to a lot better use!

If anyone has any pointers/tips for getting started/ what I should be looking at to do this I would really appreciate it, thank you!

John B
Reply
#2
look for packages that you can use in your project, search here: https://pypi.org/
example openpyxl for reading and writing excel files:
search all openpyxl:
https://pypi.org/search/?q=openpyxl&o=
specific: https://pypi.org/project/openpyxl/
and github code docs: https://openpyxl.readthedocs.io/en/stable/
Reply
#3
(Aug-17-2021, 10:14 PM)Larz60+ Wrote: look for packages that you can use in your project, search here: https://pypi.org/
example openpyxl for reading and writing excel files:
search all openpyxl:
https://pypi.org/search/?q=openpyxl&o=
specific: https://pypi.org/project/openpyxl/
and github code docs: https://openpyxl.readthedocs.io/en/stable/

Thanks for the heads up that's really helpful! Appreciate it so much!

John
Reply
#4
import os
import pandas as pd 
cwd = os.path.abspath('K:\\ExternalLinkValidationReports\\UK\\2020-01-06')
files = os.listdir(cwd)
df = pd.DataFrame()
for file in files:
     if file.endswith('.xlsx'):
         df = df.append(pd.read_excel(file), ignore_index=True) 
df.head()
df.to_excel('Appended Reports.xlsx')
So far I've managed to create an appended report that collects and appends all the Excel files in K:\\ExternalLinkValidationReports\\UK\\2020-01-06

However, I want to modify this so that it looks for the latest modified folder in K:\\ExternalLinkValidationReports\\UK\\
rather than having to specify the exact folder each time.

I wonder if anyone knows a quick way to modify my code so that it can do this?

Many thanks!

John
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  making variables in my columns and rows in python kronhamilton 2 269 Oct-31-2021, 10:38 AM
Last Post: snippsat
  rows from sql query need to write to a file as columns sjcsvatt 6 413 Oct-09-2021, 12:45 AM
Last Post: snippsat
  Summing up rows and columns plumberpy 3 643 Aug-18-2021, 05:46 AM
Last Post: naughtyCat
  Partial Matching Rows In Pandas DataFrame Query eddywinch82 1 607 Jul-08-2021, 06:32 PM
Last Post: eddywinch82
  Pandas DataFrame combine rows by column value, where Date Rows are NULL rhat398 0 696 May-04-2021, 10:51 PM
Last Post: rhat398
  Indexing [::-1] to Reverse ALL 2D Array Rows, ALL 3D, 4D Array Columns & Rows Python Jeremy7 8 2,390 Mar-02-2021, 01:54 AM
Last Post: Jeremy7
  Pandas: how to split one row of data to multiple rows and columns in Python GerardMoussendo 4 2,368 Feb-22-2021, 06:51 PM
Last Post: eddywinch82
Question How to make a 3D List of Excel Spreadsheets? chatguy 4 978 Jan-24-2021, 05:24 AM
Last Post: buran
  How to split dataframe object rows to columns Mekala 1 896 Nov-12-2020, 04:18 PM
Last Post: michael1789
  Numpy Play with Rows and Columns of a Matix quest 6 1,170 Nov-05-2020, 09:08 PM
Last Post: quest

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020